Preface


Purpose 


The use of probability models and statistical methods for analyzing data has become 
common practice in virtually all scientific disciplines. This book attempts to provide 
a comprehensive introduction to those models and methods most likely to be encoun- 
tered and used by students in their careers in engineering and the natural sciences. 
Although the examples and exercises have been designed with scientists and engi- 
neers in mind, most of the methods covered are basic to statistical analyses in many 
other disciplines, so that students of business and the social sciences will also profit 
from reading the book. 


Approach 


Students in a statistics course designed to serve other majors may be initially skeptical 
of the value and relevance of the subject matter, but my experience is that students can 
be turned on to statistics by the use of good examples and exercises that blend their 
everyday experiences with their scientific interests. Consequently, I have worked hard 
to find examples of real, rather than artificial, data—data that someone thought was 
worth collecting and analyzing. Many of the methods presented, especially in the later 
chapters on statistical inference, are illustrated by analyzing data taken from published 
sources, and many of the exercises also involve working with such data. Sometimes 
the reader may be unfamiliar with the context of a particular problem (as indeed I 
often was), but I have found that students are more attracted by real problems with 
a somewhat strange context than by patently artificial problems in a familiar setting. 


Mathematical Level 


The exposition is relatively modest in terms of mathematical development. Substantial 
use of the calculus is made only in Chapter 4 and parts of Chapters 5 and 6. In par- 
ticular, with the exception of an occasional remark or aside, calculus appears in the 
inference part of the book only—in the second section of Chapter 6. Matrix algebra 
is not used at all. Thus almost all the exposition should be accessible to those whose 
mathematical background includes one semester or two quarters of differential and 
integral calculus. 


Content 


Chapter 1 begins with some basic concepts and terminology (population, sample,
descriptive and inferential statistics, enumerative versus analytic studies, and so on)
and continues with a survey of important graphical and numerical descriptive methods.
A rather traditional development of probability is given in Chapter 2, followed by prob- 
ability distributions of discrete and continuous random variables in Chapters 3 and 4, 
respectively. Joint distributions and their properties are discussed in the first part of 
Chapter 5. The latter part of this chapter introduces statistics and their sampling distri- 
butions, which form the bridge between probability and inference. The next three 
chapters cover point estimation, statistical intervals, and hypothesis testing based on a
single sample. Methods of inference involving two independent samples and paired 
data are presented in Chapter 9. The analysis of variance is the subject of Chapters 10 
and 11 (single-factor and multifactor, respectively). Regression makes its initial 
appearance in Chapter 12 (the simple linear regression model and correlation) 
and returns for an extensive encore in Chapter 13. The last three chapters develop 
chi-squared methods, distribution-free (nonparametric) procedures, and techniques 
from statistical quality control. 


Helping Students Learn 


Although the book’s mathematical level should give most science and engineering 
students little difficulty, working toward an understanding of the concepts and gaining 
an appreciation for the logical development of the methodology may sometimes 
require substantial effort. To help students gain such an understanding and appreci- 
ation, I have provided numerous exercises ranging in difficulty from many that 
involve routine application of text material to some that ask the reader to extend 
concepts discussed in the text to somewhat new situations. There are many more 
exercises than most instructors would want to assign during any particular course, 
but I recommend that students be required to work a substantial number of them. In 
a problem-solving discipline, active involvement of this sort is the surest way to 
identify and close the gaps in understanding that inevitably arise. Answers to most 
odd-numbered exercises appear in the answer section at the back of the text. In 
addition, a Student Solutions Manual, consisting of worked-out solutions to virtu- 
ally all the odd-numbered exercises, is available. 

To access additional course materials and companion resources, please visit 
www.cengagebrain.com. At the CengageBrain.com home page, search for the ISBN 
of your title (from the back cover of your book) using the search box at the top of 
the page. This will take you to the product page where free companion resources can 
be found. 


New for This Edition 


• The major change for this edition is the elimination of the rejection region
approach to hypothesis testing. Conclusions from a hypothesis-testing analysis 
are now based entirely on P-values. This has necessitated completely rewriting 
Section 8.1, which now introduces hypotheses and then test procedures based on 
P-values. Substantial revision of the remaining sections of Chapter 8 was then 
required, and this in turn has been propagated through the hypothesis-testing 
sections and subsections of Chapters 9-15. 

• Many new examples and exercises, almost all based on real data or actual
problems. Some of these scenarios are less technical or broader in scope than 
what has been included in previous editions—for example, investigating the 
nocebo effect (the inclination of those told about a drug’s side effects to experi- 
ence them), comparing sodium contents of cereals produced by three different 
manufacturers, predicting patient height from an easy-to-measure anatomical 
characteristic, modeling the relationship between an adolescent mother’s age 
and the birth weight of her baby, assessing the effect of smokers’ short-term 
abstinence on the accurate perception of elapsed time, and exploring the impact 
of phrasing in a quantitative literacy test. 

• More examples and exercises in the probability material (Chapters 2-5) are based
on information from published sources.

• The exposition has been polished whenever possible to help students gain a better
intuitive understanding of various concepts.


Overview and Descriptive Statistics


“I took statistics at business school, and it was a transformative 
experience. Analytical training gives you a skill set that differen- 
tiates you from most people in the labor market.” 


—LASZLO Bock, SENIOR VICE PRESIDENT OF PEOPLE OPERATIONS (IN CHARGE OF ALL HIRING) AT 
GOOGLE 


April 20, 2014, The New York Times, interview with columnist Thomas Friedman 


“I am not much given to regret, so I puzzled over this one a while. 
Should have taken much more statistics in college, I think.” 
—Max LEVCHIN, PAYPAL CO-FOUNDER, SLIDE FOUNDER 

Quote of the week from the Web site of the American Statistical Association on 
November 23, 2010 


“I keep saying that the sexy job in the next 10 years will be statisti- 
cians, and I’m not kidding.” 


—HAL VARIAN, CHIEF ECONOMIST AT GOOGLE 


August 6, 2009, The New York Times 


INTRODUCTION 


Statistical concepts and methods are not only useful but indeed often indis- 
pensable in understanding the world around us. They provide ways of gaining 
new insights into the behavior of many phenomena that you will encounter in 
your chosen field of specialization in engineering or science. 

The discipline of statistics teaches us how to make intelligent judgments 
and informed decisions in the presence of uncertainty and variation. Without 
uncertainty or variation, there would be little need for statistical methods or stat- 
isticians. If every component of a particular type had exactly the same lifetime, if 
all resistors produced by a certain manufacturer had the same resistance value, 
if pH determinations for soil specimens from a particular locale gave identical
results, and so on, then a single observation would reveal all desired information. 

An interesting manifestation of variation appeared in connection with 
determining the “greenest” way to travel. The article “Carbon Conundrum” 
(Consumer Reports, 2008: 9) identified organizations that help consumers 
calculate carbon output. The following results on output for a flight from New 
York to Los Angeles were reported: 


Carbon Calculator                        CO2 (lb)
Terra Pass                               1924
Conservation International               3000
Cool It                                  3049
World Resources Institute/Safe Climate   3163
National Wildlife Federation             3465
Sustainable Travel International         3577
Native Energy                            3960
Environmental Defense                    4000
Carbonfund.org                           4820
The Climate Trust/CarbonCounter.org      5860
Bonneville Environmental Foundation      6732


There is clearly rather substantial disagreement among these calculators 
as to exactly how much carbon is emitted, characterized in the article as “from 
a ballerina’s to Bigfoot's.” A website address was provided where readers could 
learn more about how the various calculators work. 
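
The extent of the disagreement can be made concrete with a few numerical summaries. The following is a minimal sketch in Python, which is our choice for illustration only and is not one of the packages used in this book; the variable names are ours. It computes the smallest and largest reported figures, their difference, and two measures of a typical value.

```python
from statistics import mean, median

# CO2 figures as reported in the table, one per carbon calculator.
co2_output = {
    "Terra Pass": 1924,
    "Conservation International": 3000,
    "Cool It": 3049,
    "World Resources Institute/Safe Climate": 3163,
    "National Wildlife Federation": 3465,
    "Sustainable Travel International": 3577,
    "Native Energy": 3960,
    "Environmental Defense": 4000,
    "Carbonfund.org": 4820,
    "The Climate Trust/CarbonCounter.org": 5860,
    "Bonneville Environmental Foundation": 6732,
}

values = sorted(co2_output.values())
spread = values[-1] - values[0]   # range: largest minus smallest
ratio = values[-1] / values[0]    # how many times larger the top figure is

print(f"smallest = {values[0]}, largest = {values[-1]}")
print(f"range = {spread}, largest/smallest = {ratio:.2f}")
print(f"mean = {mean(values):.1f}, median = {median(values)}")
```

The largest figure is roughly three and a half times the smallest, which is the kind of variation that statistical methods are designed to describe.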

How can statistical techniques be used to gather information and draw 
conclusions? Suppose, for example, that a materials engineer has developed a 
coating for retarding corrosion in metal pipe under specified circumstances. If 
this coating is applied to different segments of pipe, variation in environmental 
conditions and in the segments themselves will result in more substantial corro- 
sion on some segments than on others. Methods of statistical analysis could be 
used on data from such an experiment to decide whether the average amount 
of corrosion exceeds an upper specification limit of some sort or to predict how 
much corrosion will occur on a single piece of pipe. 

Alternatively, suppose the engineer has developed the coating in the belief 
that it will be superior to the currently used coating. A comparative experiment 
could be carried out to investigate this issue by applying the current coating to 
some segments of pipe and the new coating to other segments. This must be done 
with care lest the wrong conclusion emerge. For example, perhaps the average 
amount of corrosion is identical for the two coatings. However, the new coating 
may be applied to segments that have superior ability to resist corrosion and under 
less stressful environmental conditions compared to the segments and conditions 
for the current coating. The investigator would then likely observe a difference 
between the two coatings attributable not to the coatings themselves, but just to
extraneous variation. Statistics offers not only methods for analyzing the results of 
experiments once they have been carried out but also suggestions for how experi- 
ments can be performed in an efficient manner to mitigate the effects of variation 
and have a better chance of producing correct conclusions. 


1.1 Populations, Samples, and Processes 


Engineers and scientists are constantly exposed to collections of facts, or data, both 
in their professional capacities and in everyday activities. The discipline of statistics 
provides methods for organizing and summarizing data and for drawing conclusions 
based on information contained in the data. 

An investigation will typically focus on a well-defined collection of objects 
constituting a population of interest. In one study, the population might consist of 
all gelatin capsules of a particular type produced during a specified period. Another 
investigation might involve the population consisting of all individuals who received 
a B.S. in engineering during the most recent academic year. When desired informa- 
tion is available for all objects in the population, we have what is called a census. 
Constraints on time, money, and other scarce resources usually make a census 
impractical or infeasible. Instead, a subset of the population—a sample—is selected 
in some prescribed manner. Thus we might obtain a sample of bearings from a par- 
ticular production run as a basis for investigating whether bearings are conforming to 
manufacturing specifications, or we might select a sample of last year’s engineering 
graduates to obtain feedback about the quality of the engineering curricula. 
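
The distinction between a census and a sample can be sketched in a few lines of Python. Everything in this sketch is hypothetical: the population of capsule wall thicknesses is simulated only so that there is something to sample from, and simple random sampling is just one possible "prescribed manner" of selection, chosen here as an assumption.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: wall thickness (mm) of 5000 capsules.
# The numbers 0.10 and 0.005 are illustrative, not real measurements.
population = [rng.gauss(0.10, 0.005) for _ in range(5000)]

# A census uses every object in the population.
census_mean = sum(population) / len(population)

# A sample uses only a subset; here, a simple random sample of 50
# capsules drawn without replacement.
sample = rng.sample(population, 50)
sample_mean = sum(sample) / len(sample)

print(f"census mean = {census_mean:.4f} mm")
print(f"sample mean = {sample_mean:.4f} mm (from {len(sample)} capsules)")
```

The sample mean will generally differ somewhat from the census mean, and a different sample would give a different value.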

We are usually interested only in certain characteristics of the objects in a pop- 
ulation: the number of flaws on the surface of each casing, the thickness of each capsule 
wall, the gender of an engineering graduate, the age at which the individual graduated, 
and so on. A characteristic may be categorical, such as gender or type of malfunction, 
or it may be numerical in nature. In the former case, the value of the characteristic is 
a category (e.g., female or insufficient solder), whereas in the latter case, the value is a 
number (e.g., age = 23 years or diameter = .502 cm). A variable is any characteristic 
whose value may change from one object to another in the population. We shall initially 
denote variables by lowercase letters from the end of our alphabet. Examples include 


x = brand of calculator owned by a student

y = number of visits to a particular Web site during a specified period

z = braking distance of an automobile under specified conditions


Data results from making observations either on a single variable or simultaneously 
on two or more variables. A univariate data set consists of observations on a single 
variable. For example, we might determine the type of transmission, automatic (A) 
or manual (M), on each of ten automobiles recently purchased at a certain dealer- 
ship, resulting in the categorical data set 


M A A A M A A M A A


The following sample of pulse rates (beats per minute) for patients recently admitted 
to an adult intensive care unit is a numerical univariate data set: 


88 80 71 103 154 132 67 110 60 105 

We have bivariate data when observations are made on each of two variables. Our
data set might consist of a (height, weight) pair for each basketball player on a 
team, with the first observation as (72, 168), the second as (75, 212), and so on. If 
an engineer determines the value of both x = component lifetime and y = reason 
for component failure, the resulting data set is bivariate with one variable numeri- 
cal and the other categorical. Multivariate data arises when observations are made 
on more than one variable (so bivariate is a special case of multivariate). For exam- 
ple, a research physician might determine the systolic blood pressure, diastolic blood 
pressure, and serum cholesterol level for each patient participating in a study. 
Each observation would be a triple of numbers, such as (120, 80, 146). In many 
multivariate data sets, some variables are numerical and others are categorical. Thus 
the annual automobile issue of Consumer Reports gives values of such variables as 
type of vehicle (small, sporty, compact, mid-size, large), city fuel efficiency (mpg), 
highway fuel efficiency (mpg), drivetrain type (rear wheel, front wheel, four 
wheel), and so on. 
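
The data sets described in this section map naturally onto simple data structures. The following Python sketch is one possible representation, not a prescribed one; the variable names are ours, and only observations actually quoted in the text are included.

```python
from collections import Counter
from statistics import mean

# Categorical univariate data: transmission type for ten automobiles.
transmissions = "M A A A M A A M A A".split()
counts = Counter(transmissions)        # frequency of each category

# Numerical univariate data: pulse rates (beats per minute).
pulse_rates = [88, 80, 71, 103, 154, 132, 67, 110, 60, 105]

# Bivariate data: one (height, weight) pair per player. Only the first
# two observations are given in the text.
players = [(72, 168), (75, 212)]
heights = [height for height, weight in players]

# Multivariate data: (systolic, diastolic, cholesterol) for each patient.
patients = [(120, 80, 146)]

print(counts)
print("mean pulse rate:", mean(pulse_rates))
print("heights:", heights)
```

A categorical variable is summarized by counting categories, whereas a numerical variable supports arithmetic summaries such as a mean.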


Branches of Statistics 


An investigator who has collected data may wish simply to summarize and describe 
important features of the data. This entails using methods from descriptive statistics. 
Some of these methods are graphical in nature; the construction of histograms, boxplots, 
and scatter plots are primary examples. Other descriptive methods involve calculation of 
numerical summary measures, such as means, standard deviations, and correlation coef- 
ficients. The wide availability of statistical computer software packages has made these 
tasks much easier to carry out than they used to be. Computers are much more efficient 
than human beings at calculation and the creation of pictures (once they have received 
appropriate instructions from the user!). This means that the investigator doesn’t have 
to expend much effort on “grunt work” and will have more time to study the data and 
extract important messages. Throughout this book, we will present output from various 
packages such as Minitab, SAS, JMP, and R. The R software can be downloaded without 
charge from the site http://www.r-project.org. It has achieved great popularity in the 
statistical community, and many books describing its various uses are available (it does 
entail programming as opposed to the pull-down menus of Minitab and JMP). 


EXAMPLE 1.1 Charity is a big business in the United States. The Web site charitynavigator.com 
gives information on roughly 6000 charitable organizations, and there are many 
smaller charities that fly below the navigator’s radar screen. Some charities operate 
very efficiently, with fundraising and administrative expenses that are only a small 
percentage of total expenses, whereas others spend a high percentage of what they 
take in on such activities. Here is data on fundraising expenses as a percentage of 
total expenditures for a random sample of 60 charities: 


 6.1  12.6  34.7   1.6  18.8   2.2   3.0   2.2   5.6   3.8
 2.2   3.1   1.3   1.1  14.1   4.0  21.0   6.1   1.3  20.4
 7.5   3.9  10.1   8.1  19.5   5.2  12.0  15.8  10.4   5.2
 6.4  10.8  83.1   3.6   6.2   6.3  16.3  12.7   1.3   0.8
 8.8   5.1   3.7  26.3   6.0  48.0   8.2  11.7   7.2   3.9
15.3  16.6   8.8  12.0   4.7  14.7   6.4  17.0   2.5  16.2


Without any organization, it is difficult to get a sense of the data’s most prominent
features: what a typical (i.e., representative) value might be, whether values are
highly concentrated about a typical value or quite dispersed, whether there are any
gaps in the data, what fraction of the values are less than 20%, and so on. Figure 1.1
shows what is called a stem-and-leaf display as well as a histogram. In Section 1.2
we will discuss construction and interpretation of these data summaries. For the
moment, we hope you see how they begin to describe how the percentages are
distributed over the range of possible values from 0 to 100. Clearly a substantial
majority of the charities in the sample spend less than 20% on fundraising, and only a few
percentages might be viewed as beyond the bounds of sensible practice.

Figure 1.1 A Minitab stem-and-leaf display (tenths digit truncated) and histogram for the charity
fundraising percentage data

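
The claim about the 20% threshold can be checked directly. The Python sketch below assumes every value in the sample is read with one decimal place. It counts the charities below 20% and prints a simplified stem-and-leaf display (tens digit as stem, ones digit as leaf, tenths digit truncated); this is our own simplification and does not reproduce Minitab's exact layout.

```python
# Fundraising expense as a percentage of total expenditures, 60 charities.
# Assumption: every value is read with one decimal place.
charity_pct = [
    6.1, 12.6, 34.7, 1.6, 18.8, 2.2, 3.0, 2.2, 5.6, 3.8,
    2.2, 3.1, 1.3, 1.1, 14.1, 4.0, 21.0, 6.1, 1.3, 20.4,
    7.5, 3.9, 10.1, 8.1, 19.5, 5.2, 12.0, 15.8, 10.4, 5.2,
    6.4, 10.8, 83.1, 3.6, 6.2, 6.3, 16.3, 12.7, 1.3, 0.8,
    8.8, 5.1, 3.7, 26.3, 6.0, 48.0, 8.2, 11.7, 7.2, 3.9,
    15.3, 16.6, 8.8, 12.0, 4.7, 14.7, 6.4, 17.0, 2.5, 16.2,
]

below_20 = sum(1 for v in charity_pct if v < 20)
fraction_below_20 = below_20 / len(charity_pct)

# Simplified stem-and-leaf: truncate the tenths digit, then use the
# tens digit as the stem and the ones digit as the leaf.
stems = {}
for v in sorted(charity_pct):
    whole = int(v)
    stems.setdefault(whole // 10, []).append(whole % 10)

for stem in range(0, max(stems) + 1):
    leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")

print(f"{below_20} of {len(charity_pct)} below 20% ({fraction_below_20:.0%})")
```

Empty stems are printed deliberately, so that gaps in the data remain visible in the display.
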

Having obtained a sample from a population, an investigator would frequently
like to use sample information to draw some type of conclusion (make an inference 
of some sort) about the population. That is, the sample is a means to an end rather 
than an end in itself. Techniques for generalizing from a sample to a population are 
gathered within the branch of our discipline called inferential statistics. 

EXAMPLE 1.2 Material strength investigations provide a rich area of application for statistical methods.
The article “Effects of Aggregates and Microfillers on the Flexural Properties of
Concrete” (Magazine of Concrete Research, 1997: 81-98) reported on a study of 
strength properties of high-performance concrete obtained by using superplasticizers 
and certain binders. The compressive strength of such concrete had previously 
been investigated, but not much was known about flexural strength (a measure of 
ability to resist failure in bending). The accompanying data on flexural strength (in 
MegaPascal, MPa, where 1 Pa (Pascal) = 1.45 × 10⁻⁴ psi) appeared in the article
cited: 


5.9 7.2 7.3 6.3 8.1 6.8 7.0 7.6 6.8 6.5 7.0 6.3 7.9 9.0
8.2 8.7 7.8 9.7 7.4 7.7 9.7 7.8 7.7 11.6 11.3 11.8 10.7 


Suppose we want an estimate of the average value of flexural strength for all beams 
that could be made in this way (if we conceptualize a population of all such beams, 
we are trying to estimate the population mean). It can be shown that, with a high 
degree of confidence, the population mean strength is between 7.48 MPa and 
8.80 MPa; we call this a confidence interval or interval estimate. Alternatively, this 
data could be used to predict the flexural strength of a single beam of this type. With 
a high degree of confidence, the strength of a single such beam will exceed 
7.35 MPa; the number 7.35 is called a lower prediction bound.
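
The interval estimate quoted in this example can be reproduced with a short calculation. The Python sketch below makes two assumptions that the example does not state: that "a high degree of confidence" means 95%, and that the interval is the standard one-sample t interval (sample mean plus or minus a t critical value times the estimated standard error). The data are read with one decimal place per value, and the critical value 2.056 for 26 degrees of freedom is taken from a t table.

```python
from math import sqrt
from statistics import mean, stdev

# Flexural strength (MPa) for the 27 beams in the example.
flexural_strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8, 6.5, 7.0, 6.3, 7.9, 9.0,
    8.2, 8.7, 7.8, 9.7, 7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

n = len(flexural_strength)
xbar = mean(flexural_strength)   # sample mean
s = stdev(flexural_strength)     # sample standard deviation (divisor n - 1)

# Assumption: 95% two-sided confidence, so the t critical value for
# n - 1 = 26 degrees of freedom is 2.056 (from a t table).
T_CRIT = 2.056

half_width = T_CRIT * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"n = {n}, mean = {xbar:.2f}, s = {s:.2f}")
print(f"interval estimate for the population mean: ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the endpoints agree with the interval given in the example to two decimal places.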

The main focus of this book is on presenting and illustrating methods of
inferential statistics that are useful in scientific work. The most important types 
of inferential procedures—point estimation, hypothesis testing, and estimation by 
confidence intervals—are introduced in Chapters 6-8 and then used in more com- 
plicated settings in Chapters 9-16. The remainder of this chapter presents methods 
from descriptive statistics that are most used in the development of inference. 

Chapters 2-5 present material from the discipline of probability. This material
ultimately forms a bridge between the descriptive and inferential techniques.
Mastery of probability leads to a better understanding of how inferential procedures 
are developed and used, how statistical conclusions can be translated into everyday 
language and interpreted, and when and where pitfalls can occur in applying the 
methods. Probability and statistics both deal with questions involving populations 
and samples, but do so in an “inverse manner” to one another. 

In a probability problem, properties of the population under study are 
assumed known (e.g., in a numerical population, some specified distribution of the 
population values may be assumed), and questions regarding a sample taken from 
the population are posed and answered. In a statistics problem, characteristics of a 
sample are available to the experimenter, and this information enables the experi- 
menter to draw conclusions about the population. The relationship between the 
two disciplines can be summarized by saying that probability reasons from the popu- 
lation to the sample (deductive reasoning), whereas inferential statistics reasons from 
the sample to the population (inductive reasoning). This is illustrated in Figure 1.2. 


[Diagram linking population and sample, with one arrow labeled Probability and the other Inferential statistics]

Figure 1.2 The relationship between probability and inferential statistics


Before we can understand what a particular sample can tell us about the popu- 
lation, we should first understand the uncertainty associated with taking a sample 
from a given population. This is why we study probability before statistics. 


EXAMPLE 1.3 As an example of the contrasting focus of probability and inferential statistics, con- 
sider drivers’ use of manual lap belts in cars equipped with automatic shoulder belt 
systems. (The article “Automobile Seat Belts: Usage Patterns in Automatic Belt 
Systems,” Human Factors, 1998: 126-135, summarizes usage data.) In probability, 
we might assume that 50% of all drivers of cars equipped in this way in a certain 
metropolitan area regularly use their lap belt (an assumption about the population), 
so we might ask, “How likely is it that a sample of 100 such drivers will include at 
least 70 who regularly use their lap belt?” or “How many of the drivers in a sample 
of size 100 can we expect to regularly use their lap belt?” On the other hand, in infer- 
ential statistics, we have sample information available; for example, a sample of 100 
drivers of such cars revealed that 65 regularly use their lap belt. We might then ask, 
“Does this provide substantial evidence for concluding that more than 50% of all 
such drivers in this area regularly use their lap belt?” In this latter scenario, we are 
attempting to use sample information to answer a question about the structure of the 
entire population from which the sample was selected.
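
Both kinds of question in this example can be answered numerically. The Python sketch below assumes that the number of lap belt users in a sample of 100 drivers follows a binomial distribution, which in turn assumes drivers behave independently and each has the same chance of being a regular user. The function name is ours. The first two results answer the probability questions; the third shows how unusual 65 or more users would be if the 50% assumption were true, which is the raw material for the inferential question.

```python
from math import comb


def binom_upper_tail(n, p, k):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


n, p = 100, 0.5   # sample size and assumed population proportion

# Probability questions: the population proportion is assumed known.
expected_users = n * p
prob_at_least_70 = binom_upper_tail(n, p, 70)

# Toward the inferential question: if p really were 0.5, how likely is
# a sample with 65 or more regular users?
prob_at_least_65 = binom_upper_tail(n, p, 65)

print(f"expected number of users: {expected_users:.0f}")
print(f"P(at least 70 users) = {prob_at_least_70:.6f}")
print(f"P(at least 65 users) = {prob_at_least_65:.5f}")
```

A small value for the last probability suggests that a sample result of 65 would be hard to reconcile with a population proportion of 50%.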


In the foregoing lap belt example, the population is well defined and concrete: all 
drivers of cars equipped in a certain way in a particular metropolitan area. In Example 
1.2, however, the strength measurements came from a sample of prototype beams that
had not been selected from an existing population. Instead, it is convenient to think of
the population as consisting of all possible strength measurements that might be made
under similar experimental conditions. Such a population is referred to as a conceptual
or hypothetical population. There are a number of problem situations in which we fit
questions into the framework of inferential statistics by conceptualizing a population.

Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.


The Scope of Modern Statistics 


These days statistical methodology is employed by investigators in virtually all dis- 
ciplines, including such areas as 


• molecular biology (analysis of microarray data)

• ecology (describing quantitatively how individuals in various animal and plant
populations are spatially distributed)

• materials engineering (studying properties of various treatments to retard corrosion)

• marketing (developing market surveys and strategies for marketing new products)

• public health (identifying sources of diseases and ways to treat them)

• civil engineering (assessing the effects of stress on structural elements and the
impacts of traffic flows on communities)


As you progress through the book, you’ll encounter a wide spectrum of different sce- 
narios in the examples and exercises that illustrate the application of techniques from 
probability and statistics. Many of these scenarios involve data or other material 
extracted from articles in engineering and science journals. The methods presented 
herein have become established and trusted tools in the arsenal of those who work with 
data. Meanwhile, statisticians continue to develop new models for describing
randomness and uncertainty and new methodology for analyzing data. As evidence of
the continuing creative efforts in the statistical community, here are titles and capsule 
descriptions of some articles that have recently appeared in statistics journals (Journal 
of the American Statistical Association is abbreviated JASA, and AAS is short for the 
Annals of Applied Statistics, two of the many prominent journals in the discipline): 


• “How Many People Do You Know? Efficiently Estimating Personal
Network Size” (JASA, 2010: 59-70): How many of the N individuals at your 
college do you know? You could select a random sample of students from the 
population and use an estimate based on the fraction of people in this sam- 
ple that you know. Unfortunately this is very inefficient for large populations 
because the fraction of the population someone knows is typically very small. A 
“latent mixing model” was proposed that the authors asserted remedied deficien- 
cies in previously used techniques. A simulation study of the method’s effec- 
tiveness based on groups consisting of first names (“How many people named 
Michael do you know?”) was included as well as an application of the method to 
actual survey data. The article concluded with some practical guidelines for the 
construction of future surveys designed to estimate social network size. 


• “Active Learning Through Sequential Design, with Applications to the
Detection of Money Laundering” (JASA, 2009: 969-981): Money launder- 
ing involves concealing the origin of funds obtained through illegal activities. 
The huge number of transactions occurring daily at financial institutions makes 
detection of money laundering difficult. The standard approach has been to 
extract various summary quantities from the transaction history and conduct a 
time-consuming investigation of suspicious activities. The article proposes a 
more efficient statistical method and illustrates its use in a case study. 


• “Robust Internal Benchmarking and False Discovery Rates for Detecting
Racial Bias in Police Stops” (JASA, 2009: 661-668): Allegations of police 
actions that are attributable at least in part to racial bias have become a contentious 
issue in many communities. This article proposes a new method that is designed 
to reduce the risk of flagging a substantial number of “false positives” (individuals 
falsely identified as manifesting bias). The method was applied to data on 500,000 
pedestrian stops in New York City in 2006; of the 3000 officers regularly involved 
in pedestrian stops, 15 were identified as having stopped a substantially greater frac- 
tion of Black and Hispanic people than what would be predicted were bias absent. 


• “Records in Athletics Through Extreme Value Theory” (JASA, 2008:
1382-1391): The focus here is on the modeling of extremes related to world 
records in athletics. The authors start by posing two questions: (1) What is the 
ultimate world record within a specific event (e.g., the high jump for women)? 
and (2) How “good” is the current world record, and how does the quality of 
current world records compare across different events? A total of 28 events 
(8 running, 3 throwing, and 3 jumping for both men and women) are consid- 
ered. For example, one conclusion is that only about 20 seconds can be shaved 
off the men’s marathon record, but that the current women’s marathon record 
is almost 5 minutes longer than what can ultimately be achieved. The method- 
ology also has applications to such issues as ensuring airport runways are long 
enough and that dikes in Holland are high enough. 


• “Self-Exciting Hurdle Models for Terrorist Activity” (AAS, 2012: 106-124): The
authors developed a predictive model of terrorist activity by considering the daily 
number of terrorist attacks in Indonesia from 1994 through 2007. The model esti- 
mates the chance of future attacks as a function of the times since past attacks. One 
feature of the model considers the excess of nonattack days coupled with the pres- 
ence of multiple coordinated attacks on the same day. The article provides an inter- 
pretation of various model characteristics and assesses its predictive performance. 


• “Prediction of Remaining Life of Power Transformers Based on Left
Truncated and Right Censored Lifetime Data” (AAS, 2009: 857-879): There 
are roughly 150,000 high-voltage power transmission transformers in the United 
States. Unexpected failures can cause substantial economic losses, so it is impor- 
tant to have predictions for remaining lifetimes. Relevant data can be complicated 
because lifetimes of some transformers extend over several decades during which 
records were not necessarily complete. In particular, the authors of the article use 
data from a certain energy company that began keeping careful records in 1980. 
But some transformers had been installed before January 1, 1980, and were still 
in service after that date (“left truncated” data), whereas other units were still in
service at the time of the investigation, so their complete lifetimes are not available 
(“right censored” data). The article describes various procedures for obtaining an 
interval of plausible values (a prediction interval) for a remaining lifetime and for 
the cumulative number of failures over a specified time period. 


“The BARISTA: A Model for Bid Arrivals in Online Auctions” (AAS, 2007: 
412-441): Online auctions such as those on eBay and uBid often have character- 
istics that differentiate them from traditional auctions. One particularly important 
difference is that the number of bidders at the outset of many traditional auctions 
is fixed, whereas in online auctions this number and the number of resulting bids 
are not predetermined. The article proposes a new BARISTA (for Bid ARrivals 
In STAges) model for describing the way in which bids arrive online. The model 
allows for higher bidding intensity at the outset of the auction and also as the 
auction comes to a close. Various properties of the model are investigated and 
then validated using data from eBay.com on auctions for Palm M515 personal
assistants, Microsoft Xbox games, and Cartier watches. 


“Statistical Challenges in the Analysis of Cosmic Microwave Background 
Radiation” (AAS, 2009: 61-95): The cosmic microwave background (CMB)
is a significant source of information about the early history of the universe. Its
radiation level is uniform, so extremely delicate instruments have been developed 
to measure fluctuations. The authors provide a review of statistical issues with 
CMB data analysis; they also give many examples of the application of statistical 
procedures to data obtained from a recent NASA satellite mission, the Wilkinson 
Microwave Anisotropy Probe. 


Statistical information now appears with increasing frequency in the popular media, 
and occasionally the spotlight is even turned on statisticians. For example, the Nov. 23,
2009, New York Times reported in an article “Behind Cancer Guidelines, Quest for 
Data” that the new science for cancer investigations and more sophisticated methods 
for data analysis spurred the U.S. Preventive Services task force to re-examine guide- 
lines for how frequently middle-aged and older women should have mammograms. 
The panel commissioned six independent groups to do statistical modeling. The 
result was a new set of conclusions, including an assertion that mammograms every 
two years are nearly as beneficial to patients as annual mammograms, but confer only 
half the risk of harms. Donald Berry, a very prominent biostatistician, was quoted as 
saying he was pleasantly surprised that the task force took the new research to heart in 
making its recommendations. The task force’s report has generated much controversy 
among cancer organizations, politicians, and women themselves. 

It is our hope that you will become increasingly convinced of the importance 
and relevance of the discipline of statistics as you dig more deeply into the book and 
the subject. Hopefully you'll be turned on enough to want to continue your statistical 
education beyond your current course. 


Enumerative Versus Analytic Studies 


W. E. Deming, a very influential American statistician who was a moving force in 
Japan’s quality revolution during the 1950s and 1960s, introduced the distinction 
between enumerative studies and analytic studies. In the former, interest is focused 
on a finite, identifiable, unchanging collection of individuals or objects that make up 
a population. A sampling frame—that is, a listing of the individuals or objects to be 
sampled—is either available to an investigator or else can be constructed. For exam- 
ple, the frame might consist of all signatures on a petition to qualify a certain initia- 
tive for the ballot in an upcoming election; a sample is usually selected to ascertain 
whether the number of valid signatures exceeds a specified value. As another 
example, the frame may contain serial numbers of all furnaces manufactured by a 
particular company during a certain time period; a sample may be selected to infer 
something about the average lifetime of these units. The use of inferential methods 
to be developed in this book is reasonably noncontroversial in such settings (though 
statisticians may still argue over which particular methods should be used). 

An analytic study is broadly defined as one that is not enumerative in nature. Such 
studies are often carried out with the objective of improving a future product by taking 
action on a process of some sort (e.g., recalibrating equipment or adjusting the level of 
some input such as the amount of a catalyst). Data can often be obtained only on an 
existing process, one that may differ in important respects from the future process. There 
is thus no sampling frame listing the individuals or objects of interest. For example, a 
sample of five turbines with a new design may be experimentally manufactured and 
tested to investigate efficiency. These five could be viewed as a sample from the conceptual
population of all prototypes that could be manufactured under similar conditions,
but not necessarily as representative of the population of units manufactured once regular 
production gets underway. Methods for using sample information to draw conclusions 
about future production units may be problematic. Someone with expertise in the area 
of turbine design and engineering (or whatever other subject area is relevant) should be 
called upon to judge whether such extrapolation is sensible. A good exposition of these 
issues is contained in the article “Assumptions for Statistical Inference” by Gerald 
Hahn and William Meeker (The American Statistician, 1993: 1-11). 


Collecting Data 


Statistics deals not only with the organization and analysis of data once it has been 
collected but also with the development of techniques for collecting the data. If data 
is not properly collected, an investigator may not be able to answer the questions 
under consideration with a reasonable degree of confidence. One common problem 
is that the target population—the one about which conclusions are to be drawn—may 
be different from the population actually sampled. For example, advertisers would 
like various kinds of information about the television-viewing habits of potential cus- 
tomers. The most systematic information of this sort comes from placing monitoring 
devices in a small number of homes across the United States. It has been conjectured 
that placement of such devices in and of itself alters viewing behavior, so that charac- 
teristics of the sample may be different from those of the target population. 

When data collection entails selecting individuals or objects from a frame, the 
simplest method for ensuring a representative selection is to take a simple random 
sample. This is one for which any particular subset of the specified size (e.g., a sam- 
ple of size 100) has the same chance of being selected. For example, if the frame 
consists of 1,000,000 serial numbers, the numbers 1, 2,..., up to 1,000,000 could 
be placed on identical slips of paper. After placing these slips in a box and thor- 
oughly mixing, slips could be drawn one by one until the requisite sample size has 
been obtained. Alternatively (and much to be preferred), a table of random numbers 
or a software package’s random number generator could be employed. 
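As a minimal sketch of the software route, the following Python fragment draws a simple random sample of size 100 from a frame of 1,000,000 serial numbers. The function name `simple_random_sample` and the fixed seed are illustrative assumptions; any package's random number generator could play the same role.

```python
import random

def simple_random_sample(frame_size, n, seed=None):
    """Draw n distinct serial numbers from 1..frame_size.

    random.sample selects without replacement, so every subset of size n
    has the same chance of being chosen.
    """
    rng = random.Random(seed)  # the seed only makes this illustration repeatable
    return rng.sample(range(1, frame_size + 1), n)

sample = simple_random_sample(1_000_000, 100, seed=1)
print(len(sample), len(set(sample)))  # 100 serial numbers, all distinct
```

Sampling without replacement is what makes this the electronic analogue of drawing slips from a well-mixed box.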

Sometimes alternative sampling methods can be used to make the selection 
process easier, to obtain extra information, or to increase the degree of confidence in 
conclusions. One such method, stratified sampling, entails separating the population 
units into nonoverlapping groups and taking a sample from each one. For example, 
a study of how physicians feel about the Affordable Care Act might proceed by 
stratifying according to specialty: select a sample of surgeons, another sample of 
radiologists, yet another sample of psychiatrists, and so on. This would result in 
information separately from each specialty and ensure that no one specialty is over- 
or underrepresented in the entire sample. 
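Stratified sampling can be sketched in the same way: take a separate simple random sample within each stratum. In the fragment below, the physician identifiers, the stratum sizes, and the per-stratum sample sizes are all made up for illustration.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of the requested size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, n_per_stratum[name])
            for name, units in strata.items()}

# Hypothetical frame: physician IDs grouped by specialty (sizes are invented).
strata = {
    "surgeons": [f"S{i}" for i in range(400)],
    "radiologists": [f"R{i}" for i in range(150)],
    "psychiatrists": [f"P{i}" for i in range(250)],
}
sizes = {"surgeons": 40, "radiologists": 15, "psychiatrists": 25}

chosen = stratified_sample(strata, sizes, seed=1)
for specialty, ids in chosen.items():
    print(specialty, len(ids))
```

Here each specialty contributes 10% of its members, one simple way to keep any specialty from being over- or underrepresented.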

Frequently a “convenience” sample is obtained by selecting individuals or 
objects without systematic randomization. As an example, a collection of bricks may 
be stacked in such a way that it is extremely difficult for those in the center to be 
selected. If the bricks on the top and sides of the stack were somehow different from 
the others, resulting sample data would not be representative of the population. Often 
an investigator will assume that such a convenience sample approximates a random 
sample, in which case a statistician’s repertoire of inferential methods can be used;
however, this is a judgment call. Most of the methods discussed herein are based on 
a variation of simple random sampling described in Chapter 5. 

Engineers and scientists often collect data by carrying out some sort of designed 
experiment. This may involve deciding how to allocate several different treatments (such 
as fertilizers or coatings for corrosion protection) to the various experimental units (plots 
of land or pieces of pipe). Alternatively, an investigator may systematically vary the
levels or categories of certain factors (e.g., pressure or type of insulating material) and 
observe the effect on some response variable (such as yield from a production process). 


EXAMPLE 1.4 An article in the New York Times (Jan. 27, 1987) reported that heart attack risk could be
reduced by taking aspirin. This conclusion was based on a designed experiment involv- 
ing both a control group of individuals that took a placebo having the appearance of 
aspirin but known to be inert and a treatment group that took aspirin according to a 
specified regimen. Subjects were randomly assigned to the groups to protect against 
any biases and so that probability-based methods could be used to analyze the data. Of 
the 11,034 individuals in the control group, 189 subsequently experienced heart attacks, 
whereas only 104 of the 11,037 in the aspirin group had a heart attack. The incidence 
rate of heart attacks in the treatment group was only about half that in the control group. 
One possible explanation for this result is chance variation—that aspirin really doesn’t 
have the desired effect and the observed difference is just typical variation in the same 
way that tossing two identical coins would usually produce different numbers of heads. 
However, in this case, inferential methods suggest that chance variation by itself cannot 
adequately explain the magnitude of the observed difference.
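The "about half" comparison in Example 1.4 is simple arithmetic on the reported counts. The short Python check below only reproduces the two incidence rates and their ratio; it is not the inferential analysis the example alludes to.

```python
# Counts reported in Example 1.4
control_attacks, control_n = 189, 11_034
aspirin_attacks, aspirin_n = 104, 11_037

control_rate = control_attacks / control_n   # proportion with a heart attack, placebo
aspirin_rate = aspirin_attacks / aspirin_n   # proportion with a heart attack, aspirin
ratio = aspirin_rate / control_rate

print(f"control {control_rate:.4f}, aspirin {aspirin_rate:.4f}, ratio {ratio:.2f}")
# control 0.0171, aspirin 0.0094, ratio 0.55
```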


EXAMPLE 1.5 An engineer wishes to investigate the effects of both adhesive type and conductor 
material on bond strength when mounting an integrated circuit (IC) on a certain sub- 
strate. Two adhesive types and two conductor materials are under consideration. 
Two observations are made for each adhesive-type/conductor-material combination, 
resulting in the accompanying data: 


Adhesive Type    Conductor Material    Observed Bond Strength    Average
      1                  1                    82, 77              79.5
      1                  2                    75, 87              81.0
      2                  1                    84, 80              82.0
      2                  2                    78, 90              84.0


The resulting average bond strengths are pictured in Figure 1.3. It appears that adhesive
type 2 improves bond strength as compared with type 1 by about the same
amount whichever one of the conducting materials is used, with the 2, 2 combin- 
ation being best. Inferential methods can again be used to judge whether these 
effects are real or simply due to chance variation. 

[Plot: average bond strength (vertical axis) against conducting material (horizontal axis), with one line for adhesive type 1 and one line for adhesive type 2]


Figure 1.3 Average bond strengths in Example 1.5
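The averages in the table, and the claim that adhesive type 2 helps by about the same amount for either conductor material, can be checked with a short Python sketch. The dictionary layout and the name `gain` are our own illustrative choices.

```python
from statistics import mean

# (adhesive type, conductor material) -> observed bond strengths from Example 1.5
strengths = {
    (1, 1): [82, 77],
    (1, 2): [75, 87],
    (2, 1): [84, 80],
    (2, 2): [78, 90],
}

averages = {combo: mean(obs) for combo, obs in strengths.items()}

# Improvement of adhesive type 2 over type 1, separately for each conductor material
gain = {c: averages[(2, c)] - averages[(1, c)] for c in (1, 2)}

print(averages)  # averages 79.5, 81, 82, 84
print(gain)      # gains of 2.5 and 3 for conductor materials 1 and 2
```

The two gains, 2.5 and 3.0, are close to each other, which is what "about the same amount" refers to; whether they reflect real effects is a question for the inferential methods developed later.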


Suppose additionally that there are two cure times under consideration and also two 
types of IC post coating. There are then 2 · 2 · 2 · 2 = 16 combinations of these four
factors, and our engineer may not have enough resources to make even a single observation
for each of these combinations. In Chapter 11, we will see how the careful selection
of a fraction of these possibilities will usually yield the desired information.


EXERCISES Section 1.1 (1-9) 


1. Give one possible sample of size 4 from each of the following
populations:

a. All daily newspapers published in the United States

b. All companies listed on the New York Stock
Exchange

c. All students at your college or university

d. All grade point averages of students at your college
or university


2. For each of the following hypothetical populations, give
a plausible sample of size 4:

a. All distances that might result when you throw a 
football 

b. Page lengths of books published 5 years from now 

c. All possible earthquake-strength measurements 
(Richter scale) that might be recorded in California 
during the next year 

d. All possible yields (in grams) from a certain chemi- 
cal reaction carried out in a laboratory 


3. Consider the population consisting of all computers of a
certain brand and model, and focus on whether a computer
needs service while under warranty.

a. Pose several probability questions based on selecting 
a sample of 100 such computers. 

b. What inferential statistics question might be answered 
by determining the number of such computers in a 
sample of size 100 that need warranty service? 


4. a. Give three different examples of concrete populations
and three different examples of hypothetical
populations.

b. For one each of your concrete and your hypothetical 
populations, give an example of a probability question 
and an example of an inferential statistics question. 


5. Many universities and colleges have instituted supplemental
instruction (SI) programs, in which a student facilitator
meets regularly with a small group of students enrolled in 
the course to promote discussion of course material and 
enhance subject mastery. Suppose that students in a large 
statistics course (what else?) are randomly divided into a 
control group that will not participate in SI and a treatment 
group that will participate. At the end of the term, each 
student’s total score in the course is determined. 
a. Are the scores from the SI group a sample from an 
existing population? If so, what is it? If not, what is 
the relevant conceptual population? 
b. What do you think is the advantage of randomly
dividing the students into the two groups rather than 
letting each student choose which group to join? 

c. Why didn’t the investigators put all students in the treatment
group? [Note: The article “Supplemental
Instruction: An Effective Component of Student 
Affairs Programming” (J. of College Student Devel., 
1997: 577-586) discusses the analysis of data from 
several SI programs.] 


6. The California State University (CSU) system consists
of 23 campuses, from San Diego State in the south to 
Humboldt State near the Oregon border. A CSU admin- 
istrator wishes to make an inference about the average 
distance between the hometowns of students and their 
campuses. Describe and discuss several different sam- 
pling methods that might be employed. Would this be 
an enumerative or an analytic study? Explain your 
reasoning. 


7. A certain city divides naturally into ten district neighborhoods.
How might a real estate appraiser select a sample
of single-family homes that could be used as a basis for 
developing an equation to predict appraised value from 
characteristics such as age, size, number of bathrooms, 
distance to the nearest school, and so on? Is the study 
enumerative or analytic? 


8. The amount of flow through a solenoid valve in an automobile’s
pollution-control system is an important characteristic.
An experiment was carried out to study how
flow rate depended on three factors: armature length, 
spring load, and bobbin depth. Two different levels (low 
and high) of each factor were chosen, and a single 
observation on flow was made for each combination of 
levels. 
a. The resulting data set consisted of how many 
observations? 
b. Is this an enumerative or analytic study? Explain 
your reasoning. 


9. In a famous experiment carried out in 1882, Michelson
and Newcomb obtained 66 observations on the time it 
took for light to travel between two locations in 
Washington, D.C. A few of the measurements (coded in 
a certain manner) were 31, 23, 32, 36, -2, 26, 27, and 31.
a. Why are these measurements not identical? 

b. Is this an enumerative study? Why or why not? 
1.2 Pictorial and Tabular Methods in Descriptive Statistics


Descriptive statistics can be divided into two general subject areas. In this section, we 
consider representing a data set using visual displays. In Sections 1.3 and 1.4, we will 
develop some numerical summary measures for data sets. Many visual techniques 
may already be familiar to you: frequency tables, tally sheets, histograms, pie charts, 
bar graphs, scatter diagrams, and the like. Here we focus on a selected few of these 
techniques that are most useful and relevant to probability and inferential statistics. 


Notation 


Some general notation will make it easier to apply our methods and formulas to a 
wide variety of practical problems. The number of observations in a single sample, 
that is, the sample size, will often be denoted by n, so that n = 4 for the sample of 
universities {Stanford, Iowa State, Wyoming, Rochester} and also for the sample of
pH measurements {6.3, 6.2, 5.9, 6.5}. If two samples are simultaneously under consideration,
either m and n or $n_1$ and $n_2$ can be used to denote the numbers of observations.
An experiment to compare thermal efficiencies for two different types of diesel
engines might result in samples {29.7, 31.6, 30.9} and {28.7, 29.5, 29.4, 30.3}, in 
which case m = 3 and n = 4. 

Given a data set consisting of n observations on some variable x, the individual
observations will be denoted by $x_1, x_2, x_3, \ldots, x_n$. The subscript bears no relation
to the magnitude of a particular observation. Thus $x_1$ will not in general be the smallest
observation in the set, nor will $x_n$ typically be the largest. In many applications,
$x_1$ will be the first observation gathered by the experimenter, $x_2$ the second, and so
on. The ith observation in the data set will be denoted by $x_i$.


Stem-and-Leaf Displays 


Consider a numerical data set $x_1, x_2, \ldots, x_n$ for which each $x_i$ consists of at least two
digits. A quick way to obtain an informative visual representation of the data set is 
to construct a stem-and-leaf display. 


Constructing a Stem-and-Leaf Display 

1. Select one or more leading digits for the stem values. The trailing digits 
become the leaves. 

2. List possible stem values in a vertical column. 

3. Record the leaf for each observation beside the corresponding stem value. 


4. Indicate the units for stems and leaves someplace in the display. 


For a data set consisting of exam scores, each between 0 and 100, the score of 83 
would have a stem of 8 and a leaf of 3. If all exam scores are in the 90s, 80s, and 
70s (an instructor’s dream!), use of the tens digit as the stem would give a display 
with only three rows. In this case, it is desirable to stretch the display by repeating
each stem value twice—9H, 9L, 8H, ...,7L—once for high leaves 9,..., 5 and 
again for low leaves 4, ..., 0. Then a score of 93 would have a stem of 9L and 
leaf of 3. In general, a display based on between 5 and 20 stems is recommended. 
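The construction steps, including the option of repeating each stem for low and high leaves, can be sketched in Python as follows. The function `stem_and_leaf` and the exam scores are hypothetical; the sketch assumes nonnegative integer data with the ones digit as the leaf.

```python
def stem_and_leaf(scores, split=False):
    """Rows of a stem-and-leaf display: leading digit(s) as stem, ones digit as leaf.

    With split=True each stem is listed twice: L for leaves 0-4, H for leaves 5-9.
    Every possible stem between the smallest and largest is listed, even if empty.
    """
    halves = ("L", "H") if split else ("",)
    low_stem, high_stem = min(scores) // 10, max(scores) // 10
    rows = {(stem, half): []
            for stem in range(low_stem, high_stem + 1)
            for half in halves}
    for score in sorted(scores):          # sorting orders the leaves within a row
        stem, leaf = divmod(score, 10)
        half = ("H" if leaf >= 5 else "L") if split else ""
        rows[(stem, half)].append(str(leaf))
    return [f"{stem}{half} | {''.join(leaves)}"
            for (stem, half), leaves in rows.items()]

# Hypothetical exam scores, all in the 70s, 80s, and 90s
scores = [93, 88, 71, 79, 84, 96, 90, 75, 83, 87, 92, 78]
display = stem_and_leaf(scores, split=True)
print("\n".join(display))
print("Stem: tens digit   Leaf: ones digit")
```

With these scores the display has six rows (7L through 9H), and the score of 93 appears as leaf 3 in the 9L row, as described above.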


EXAMPLE 1.6 A common complaint among college students is that they are getting less sleep than 
they need. The article “Class Start Times, Sleep, and Academic Performance in 
College: A Path Analysis” (Chronobiology Intl., 2012: 318-335) investigated factors
that impact sleep time. The stem-and-leaf display in Figure 1.4 shows the average
number of hours of sleep per day over a two-week period for a sample of 253 students. 


 5L | 00
 5H | 6889                                   Stem: ones digit
 6L | 000111123444444                        Leaf: tenths digit
 6H | 55556778899999
 7L | 000011111112222223333333344444444
 7H | 55555555666666666666777777888888888999999999999999
 8L | 00000000000011111122222222222222222333333333334444444444444
 8H | 5555555566666666677777788888888899999999999
 9L | 00001111111222223334
 9H | 666678999
10L | 00
10H | 56


Figure 1.4 Stem-and-leaf display for average sleep time per day 


The first observation in the top row of the display is 5.0, corresponding to a 
stem of 5 and leaf of 0, and the last observation at the bottom of the display is 10.6. 
Note that in the absence of a context, without the identification of stem and leaf 
digits in the display, we wouldn’t know whether the observation with stem 7 and 
leaf 9 was .79, 7.9, or 79. The leaves in each row are ordered from smallest to larg- 
est; this is commonly done by software packages but is not necessary if a display is 
created by hand. 

The display suggests that a typical or representative sleep time is in the stem 
8L row, perhaps 8.1 or 8.2. The data is not highly concentrated about this typical 
value as would be the case if almost all students were getting between 7.5 and 9.5 
hours of sleep on average. The display appears to rise rather smoothly to a peak in 
the 8L row and then decline smoothly (we conjecture that the minor peak in the 6L 
row would disappear if more data was available). The general shape of the display 
is rather symmetric, bearing strong resemblance to a bell-shaped curve; it does not 
stretch out more in one direction than the other. The two smallest and two largest 
values seem a bit separated from the remainder of the data—perhaps they are very 
mild, but certainly not extreme, “outliers.” A reference in the cited article suggests
that individuals in this age group need about 8.4 hours of sleep per day. So it appears
that a substantial percentage of students in the sample are sleep deprived.


A stem-and-leaf display conveys information about the following aspects of 
the data: 
• identification of a typical or representative value
• extent of spread about the typical value
• presence of any gaps in the data
• extent of symmetry in the distribution of values
• number and locations of peaks
• presence of any outliers (values far from the rest of the data)


EXAMPLE 1.7 Figure 1.5 presents stem-and-leaf displays for a random sample of lengths of golf
courses (yards) that have been designated by Golf Magazine as among the most challenging
in the United States. Among the sample of 40 courses, the shortest is 6433 yards
long, and the longest is 7280 yards. The lengths appear to be distributed in a roughly
uniform fashion over the range of values in the sample. Notice that a stem choice here of
either a single digit (6 or 7) or three digits (643, ..., 728) would yield an uninformative
display, the first because of too few stems and the latter because of too many.


64 | 35 64 33 70              Stem: Thousands and hundreds digits
65 | 26 27 06 83              Leaf: Tens and ones digits
66 | 05 94 14
67 | 90 70 00 98 70 45 13
68 | 90 70 73 50
69 | 00 27 36 04
70 | 51 05 11 40 50 22
71 | 31 69 68 05 13 65
72 | 80 09

(a)


Stem-and-leaf of yardage  N = 40
Leaf Unit = 10

  4   64  3367
  8   65  0228
 11   66  019
 18   67  0147799
 (4)  68  5779
 18   69  0023
 14   70  012455
  8   71  013666
  2   72  08

(b)


Figure 1.5 Stem-and-leaf displays of golf course lengths: (a) two-digit leaves; (b) display from
Minitab with truncated one-digit leaves


Statistical software packages do not generally produce displays with multiple-digit
stems. The Minitab display in Figure 1.5(b) results from truncating each observation
by deleting the ones digit.


Dotplots


A dotplot is an attractive summary of numerical data when the data set is reasonably
small or there are relatively few distinct data values. Each observation is represented
by a dot above the corresponding location on a horizontal measurement scale. When
a value occurs more than once, there is a dot for each occurrence, and these dots are
stacked vertically. As with a stem-and-leaf display, a dotplot gives information about
location, spread, extremes, and gaps.


EXAMPLE 1.8 There is growing concern in the U.S. that not enough students are graduating from
college. America used to be number 1 in the world for the percentage of adults
with college degrees, but it has recently dropped to 16th. Here is data on the 
percentage of 25- to 34-year-olds in each state who had some type of postsecond- 
ary degree as of 2010 (listed in alphabetical order, with the District of Columbia 
included): 
31.5 32.9 33.0 28.6 37.9 43.3 45.9 37.2 68.8 36.2 35.5
40.5 37.2 45.3 36.1 45.5 42.3 33.3 30.3 37.2 45.5 54.3
37.2 49.8 32.1 39.3 40.3 44.2 28.4 46.0 47.2 28.7 49.6
37.6 50.8 38.0 30.8 37.6 43.9 42.5 35.2 42.2 32.8 32.2
38.5 44.5 44.6 40.9 29.5 41.3 35.4
Figure 1.6 shows a dotplot of the data. Dots corresponding to some values close together
(e.g., 28.6 and 28.7) have been vertically stacked to prevent crowding. There is clearly a 
great deal of state-to-state variability. The largest value, for D.C., is obviously an extreme 
outlier, and four other values on the upper end of the data are candidates for mild outliers 
(MA, MN, NY, and ND). There is also a cluster of states at the low end, primarily located 
in the South and Southwest. The overall percentage for the entire country is 39.3%; this 
is not a simple average of the 51 numbers but an average weighted by population sizes. 


[Dotplot: one dot per value above a horizontal scale running from 25 to 70, labeled every 5 units]

Figure 1.6 A dotplot of the data from Example 1.8
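A rough text version of such a dotplot can be produced with a few lines of Python. This is only a sketch: the function `text_dotplot` is hypothetical, and grouping values into columns one percentage point wide is our own choice for stacking nearby values, not necessarily the rule used to draw Figure 1.6.

```python
import math
from collections import Counter

# Percentages from Example 1.8 (51 values, alphabetical by state, D.C. included)
percentages = [
    31.5, 32.9, 33.0, 28.6, 37.9, 43.3, 45.9, 37.2, 68.8, 36.2, 35.5,
    40.5, 37.2, 45.3, 36.1, 45.5, 42.3, 33.3, 30.3, 37.2, 45.5, 54.3,
    37.2, 49.8, 32.1, 39.3, 40.3, 44.2, 28.4, 46.0, 47.2, 28.7, 49.6,
    37.6, 50.8, 38.0, 30.8, 37.6, 43.9, 42.5, 35.2, 42.2, 32.8, 32.2,
    38.5, 44.5, 44.6, 40.9, 29.5, 41.3, 35.4,
]

def text_dotplot(data, width=1.0):
    """Stack one '*' per observation over columns that are `width` units wide."""
    columns = Counter(math.floor(x / width) for x in data)
    lo, hi = min(columns), max(columns)
    rows = []
    for level in range(max(columns.values()), 0, -1):   # top row first
        rows.append("".join("*" if columns[c] >= level else " "
                            for c in range(lo, hi + 1)))
    return rows, lo * width

rows, left_edge = text_dotplot(percentages)
print("\n".join(rows))
print(f"leftmost column starts at {left_edge:g}")
```

Even in this crude form, the isolated dot far to the right (the 68.8 for D.C.) stands apart from the rest of the data.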


A dotplot can be quite cumbersome to construct and look crowded when the 
number of observations is large. Our next technique is well suited to such situations. 


Histograms 


Some numerical data is obtained by counting to determine the value of a variable (the 
number of traffic citations a person received during the last year, the number of custom- 
ers arriving for service during a particular period), whereas other data is obtained by 
taking measurements (weight of an individual, reaction time to a particular stimulus). 
The prescription for drawing a histogram is generally different for these two cases. 


DEFINITION A numerical variable is discrete if its set of possible values either is finite or 
else can be listed in an infinite sequence (one in which there is a first number, 
a second number, and so on). A numerical variable is continuous if its possible 
values consist of an entire interval on the number line. 


A discrete variable x almost always results from counting, in which case pos- 
sible values are 0, 1, 2, 3, ... or some subset of these integers. Continuous variables 
arise from making measurements. For example, if x is the pH of a chemical sub- 
stance, then in theory x could be any number between 0 and 14: 7.0, 7.03, 7.032, and 
so on. Of course, in practice there are limitations on the degree of accuracy of any 
measuring instrument, so we may not be able to determine pH, reaction time, height, 
and concentration to an arbitrarily large number of decimal places. However, from 
the point of view of creating mathematical models for distributions of data, it is help- 
ful to imagine an entire continuum of possible values. 

Consider data consisting of observations on a discrete variable x. The frequency 
of any particular x value is the number of times that value occurs in the data set. The 
relative frequency of a value is the fraction or proportion of times the value occurs: 


\[
\text{relative frequency of a value} = \frac{\text{number of times the value occurs}}{\text{number of observations in the data set}}
\]


Suppose, for example, that our data set consists of 200 observations on x = the number 
of courses a college student is taking this term. If 70 of these x values are 3, then 


frequency of the x value 3: 70

relative frequency of the x value 3: $\frac{70}{200} = .35$
Multiplying a relative frequency by 100 gives a percentage; in the college-course
example, 35% of the students in the sample are taking three courses. The relative 
frequencies, or percentages, are usually of more interest than the frequencies them- 
selves. In theory, the relative frequencies should sum to 1, but in practice the sum 
may differ slightly from 1 because of rounding. A frequency distribution is a tabu- 
lation of the frequencies and/or relative frequencies. 
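A frequency distribution is easy to tabulate by machine. In the Python sketch below, the data are invented so as to agree with the text (200 students, 70 of whom are taking three courses); the remaining counts and the function name `frequency_distribution` are assumptions for illustration only.

```python
from collections import Counter

def frequency_distribution(observations):
    """Map each distinct value to its (frequency, relative frequency)."""
    n = len(observations)
    counts = Counter(observations)
    return {value: (count, count / n) for value, count in sorted(counts.items())}

# Hypothetical data: number of courses for each of 200 students
courses = [2] * 15 + [3] * 70 + [4] * 60 + [5] * 45 + [6] * 10

dist = frequency_distribution(courses)
for value, (freq, rel) in dist.items():
    print(f"x = {value}: frequency {freq:3d}, relative frequency {rel:.3f}")
```

For x = 3 this gives frequency 70 and relative frequency .350, and the relative frequencies sum to 1 apart from floating-point rounding.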


Constructing a Histogram for Discrete Data 


First, determine the frequency and relative frequency of each x value. Then mark 
possible x values on a horizontal scale. Above each value, draw a rectangle whose 
height is the relative frequency (or alternatively, the frequency) of that value; the 
rectangles should have equal widths. 


This construction ensures that the area of each rectangle is proportional to the relative
frequency of the value. Thus if the relative frequencies of x = 1 and x = 5 are
.35 and .07, respectively, then the area of the rectangle above 1 is five times the area
of the rectangle above 5.


EXAMPLE 1.9 How unusual is a no-hitter or a one-hitter in a major league baseball game, and how 
frequently does a team get more than 10, 15, or even 20 hits? Table 1.1 is a frequency 
distribution for the number of hits per team per game for all nine-inning games that 
were played between 1989 and 1993. 


Table 1.1 Frequency Distribution for Hits in Nine-Inning Games 


            Number     Relative                 Number     Relative
Hits/Game   of Games   Frequency    Hits/Game   of Games   Frequency

    0           20      .0010          14          569      .0294
    1           72      .0037          15          393      .0203
    2          209      .0108          16          253      .0131
    3          527      .0272          17          171      .0088
    4         1048      .0541          18           97      .0050
    5         1457      .0752          19           53      .0027
    6         1988      .1026          20           31      .0016
    7         2256      .1164          21           19      .0010
    8         2403      .1240          22           13      .0007
    9         2256      .1164          23            5      .0003
   10         1967      .1015          24            1      .0001
   11         1509      .0779          25            0      .0000
   12         1230      .0635          26            1      .0001
   13          834      .0430          27            1      .0001
                                                19,383     1.0005


The corresponding histogram in Figure 1.7 rises rather smoothly to a single peak and 
then declines. The histogram extends a bit more on the right (toward large values) 
than it does on the left—a slight “positive skew.” 
[Histogram: relative frequency (vertical axis, ticks at 0, .05, and .10) against hits/game (horizontal axis, ticks at 0, 10, and 20)]


Figure 1.7 Histogram of number of hits per nine-inning game


Either from the tabulated information or from the histogram itself, we can determine 
the following: 


\[
\begin{aligned}
\text{proportion of games with at most two hits}
  &= (\text{relative frequency for } x = 0) + (\text{relative frequency for } x = 1) + (\text{relative frequency for } x = 2) \\
  &= .0010 + .0037 + .0108 = .0155
\end{aligned}
\]


Similarly,


\[
\text{proportion of games with between 5 and 10 hits (inclusive)} = .0752 + .1026 + \cdots + .1015 = .6361
\]


That is, roughly 64% of all these games resulted in between 5 and 10 (inclusive)
hits.
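These sums can be reproduced from the game counts in Table 1.1. The Python sketch below rounds each relative frequency to four decimal places, as in the table, before adding; the variable names are our own.

```python
# Number of games for each value of hits per game, 0 through 27 (Table 1.1)
games = [20, 72, 209, 527, 1048, 1457, 1988, 2256, 2403, 2256, 1967, 1509, 1230, 834,
         569, 393, 253, 171, 97, 53, 31, 19, 13, 5, 1, 0, 1, 1]

total = sum(games)                                  # 19,383 games in all
rel_freq = [round(g / total, 4) for g in games]     # rounded as in the table

at_most_two = round(sum(rel_freq[0:3]), 4)          # x = 0, 1, 2
five_to_ten = round(sum(rel_freq[5:11]), 4)         # x = 5, ..., 10 inclusive
rounded_total = round(sum(rel_freq), 4)

print(total, at_most_two, five_to_ten, rounded_total)
```

The rounded relative frequencies add to 1.0005 rather than exactly 1, which illustrates the earlier remark that rounding can make the sum differ slightly from 1.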


Constructing a histogram for continuous data (measurements) entails subdividing 
the measurement axis into a suitable number of class intervals or classes, such that 
each observation is contained in exactly one class. Suppose, for example, that we 
have 50 observations on x = fuel efficiency of an automobile (mpg), the smallest of 
which is 27.8 and the largest of which is 31.4. Then we could use the class bounda- 
ries 27.5, 28.0, 28.5, ... , and 31.5 as shown here: 


One potential difficulty is that occasionally an observation, for example 29.0, lies on a class
boundary and therefore does not fall in exactly one interval. One way to deal with
this problem is to use boundaries like 27.55, 28.05, ..., 31.55. Adding a hundredths digit 
to the class boundaries prevents observations from falling on the resulting boundaries. 
Another approach is to use the classes 27.5-<28.0, 28.0-<28.5, ..., 31.0-<31.5.
Then 29.0 falls in the class 29.0-<29.5 rather than in the class 28.5-<29.0. In
other words, with this convention, an observation on a boundary is placed in the inter- 
val to the right of the boundary. This is how Minitab constructs a histogram. 
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Constructing a Histogram for Continuous Data: Equal Class Widths 


Determine the frequency and relative frequency for each class. Mark the 
class boundaries on a horizontal measurement axis. Above each class inter- 
val, draw a rectangle whose height is the corresponding relative frequency 
(or frequency). 
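Determining class frequencies requires a rule for observations that land exactly on a 
boundary. The following is a minimal Python sketch of the "place it in the interval to 
the right" convention, using the fuel-efficiency boundaries 27.5, 28.0, ..., 31.5 from 
the discussion above. The function name class_index is hypothetical and chosen only 
for illustration; it is not how any particular package is implemented. 

```python
from bisect import bisect_right


def class_index(x, boundaries):
    """Index of the class lower-<upper that contains x, or None if x is outside all classes."""
    if x < boundaries[0] or x >= boundaries[-1]:
        return None
    # bisect_right counts the boundaries that are <= x, so a value sitting
    # exactly on a boundary is assigned to the class that starts there.
    return bisect_right(boundaries, x) - 1


# Class boundaries 27.5, 28.0, ..., 31.5 (multiples of 0.5 are exact in floating point)
boundaries = [27.5 + 0.5 * k for k in range(9)]

i = class_index(29.0, boundaries)
print(f"29.0 falls in the class {boundaries[i]}-<{boundaries[i + 1]}")
```

The same idea extends to tallying frequencies: loop over the observations and add one 
to the count of the class index returned for each. 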


EXAMPLE 1.10 Power companies need information about customer usage to obtain accurate 
forecasts of demands. Investigators from Wisconsin Power and Light determined energy 
consumption (BTUs) during a particular period for a sample of 90 gas-heated 
homes. An adjusted consumption value was calculated as follows: 


adjusted consumption = consumption / [(weather, in degree days)(house area)] 


This resulted in the accompanying data (part of the stored data set FURNACE.MTW 
available in Minitab), which we have ordered from smallest to largest. 


2.97 4.00 5.20 5.56 5.94 5.98 6.35 6.62 6.72 6.78 
6.80 6.85 6.94 7.15 7.16 7.23 7.29 7.62 7.62 7.69 
7.73 7.87 7.93 8.00 8.26 8.29 8.37 8.47 8.54 8.58 
8.61 8.67 8.69 8.81 9.07 9.27 9.37 9.43 9.52 9.58 
9.60 9.76 9.82 9.83 9.83 9.84 9.96 10.04 10.21 10.28 
10.28 10.30 10.35 10.36 10.40 10.49 10.50 10.64 10.95 11.09 
11.12 11.21 11.29 11.43 11.62 11.70 11.70 12.16 12.19 12.28 
12.31 12.62 12.69 12.71 12.91 12.92 13.11 13.38 13.42 13.43 
13.47 13.60 13.96 14.24 14.35 15.12 15.24 16.06 16.90 18.26 


The most striking feature of the histogram in Figure 1.8 is its resemblance to a bell- 
shaped curve, with the point of symmetry roughly at 10. 


Class               1-<3   3-<5   5-<7   7-<9   9-<11  11-<13  13-<15  15-<17  17-<19 
Frequency           1      1      11     21     25     17      9       4       1 
Relative frequency  .011   .011   .122   .233   .278   .189    .100    .044    .011 


Figure 1.8 Histogram of the energy consumption data from Example 1.10 (horizontal axis: BTU) 


From the histogram, 


proportion of observations less than 9 
   ≈ .01 + .01 + .12 + .23 = .37    (exact value = 34/90 = .378) 


The relative frequency for the 9-<11 class is about .27, so we estimate that roughly 
half of this, or .135, is between 9 and 10. Thus 


proportion of observations less than 10 
   ≈ .37 + .135 = .505    (slightly more than 50%) 


The exact value of this proportion is 47/90 = .522. 
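As a check on the arithmetic in Example 1.10, the class frequencies and the two exact 
proportions can be recomputed directly from the 90 ordered values. This is only an 
illustrative Python sketch: the variable names are ours, and the tally uses the 
convention that a boundary value goes to the class on its right. 

```python
from bisect import bisect_right

# Adjusted consumption values from Example 1.10 (n = 90), smallest to largest
data = [
    2.97, 4.00, 5.20, 5.56, 5.94, 5.98, 6.35, 6.62, 6.72, 6.78,
    6.80, 6.85, 6.94, 7.15, 7.16, 7.23, 7.29, 7.62, 7.62, 7.69,
    7.73, 7.87, 7.93, 8.00, 8.26, 8.29, 8.37, 8.47, 8.54, 8.58,
    8.61, 8.67, 8.69, 8.81, 9.07, 9.27, 9.37, 9.43, 9.52, 9.58,
    9.60, 9.76, 9.82, 9.83, 9.83, 9.84, 9.96, 10.04, 10.21, 10.28,
    10.28, 10.30, 10.35, 10.36, 10.40, 10.49, 10.50, 10.64, 10.95, 11.09,
    11.12, 11.21, 11.29, 11.43, 11.62, 11.70, 11.70, 12.16, 12.19, 12.28,
    12.31, 12.62, 12.69, 12.71, 12.91, 12.92, 13.11, 13.38, 13.42, 13.43,
    13.47, 13.60, 13.96, 14.24, 14.35, 15.12, 15.24, 16.06, 16.90, 18.26,
]

# Class boundaries for the classes 1-<3, 3-<5, ..., 17-<19
boundaries = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

# Tally each observation into the class whose lower boundary is the
# largest boundary not exceeding it
freq = [0] * (len(boundaries) - 1)
for x in data:
    freq[bisect_right(boundaries, x) - 1] += 1

n = len(data)
rel_freq = [f / n for f in freq]

# Exact proportions, for comparison with the estimates read from the histogram
prop_below_9 = sum(x < 9 for x in data) / n
prop_below_10 = sum(x < 10 for x in data) / n

print(freq)
print([round(r, 3) for r in rel_freq])
print(round(prop_below_9, 3), round(prop_below_10, 3))
```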


There are no hard-and-fast rules concerning either the number of classes or the 
choice of classes themselves. Between 5 and 20 classes will be satisfactory for most 
data sets. Generally, the larger the number of observations in a data set, the more 
classes should be used. A reasonable rule of thumb is 


number of classes ≈ √(number of observations) 
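A rough Python sketch of this rule of thumb follows. The function name and the 
decision to round to the nearest integer are our own assumptions; the rule is only a 
starting point, not a requirement. 

```python
import math


def suggested_number_of_classes(n_observations):
    """Square-root rule of thumb, rounded to the nearest whole number of classes."""
    return max(1, round(math.sqrt(n_observations)))


# 90 observations, as in Example 1.10, suggest roughly 9 classes
print(suggested_number_of_classes(90))
```

For the 90 observations of Example 1.10 this suggests about nine classes, which is the 
number used in Figure 1.8. 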


Equal-width classes may not be a sensible choice if there are some regions of the 
measurement scale that have a high concentration of data values and other parts 
where data is quite sparse. Figure 1.9 shows a dotplot of such a data set; there is 
high concentration in the middle, and relatively few observations stretched out to 
either side. Using a small number of equal-width classes results in almost all obser- 
vations falling in just one or two of the classes. If a large number of equal-width 
classes are used, many classes will have zero frequency. A sound choice is to use a 
few wider intervals near extreme observations and narrower intervals in the region 
of high concentration. 


Figure 1.9 Selecting class intervals for “varying density” data: (a) many short equal-width 
intervals; (b) a few wide equal-width intervals; (c) unequal-width intervals 


Constructing a Histogram for Continuous Data: Unequal Class Widths 


After determining frequencies and relative frequencies, calculate the height of 
each rectangle using the formula 


rectangle height = (relative frequency of the class) / (class width) 


The resulting rectangle heights are usually called densities, and the vertical 
scale is the density scale. This prescription will also work when class widths 
are equal. 


EXAMPLE 1.11 Corrosion of reinforcing steel is a serious problem in concrete structures located 
in environments affected by severe weather conditions. For this reason, researchers 
have been investigating the use of reinforcing bars made of composite material. 
One study was carried out to develop guidelines for bonding glass-fiber-reinforced 
plastic rebars to concrete (“Design Recommendations for Bond of GFRP Rebars 
to Concrete,” J. of Structural Engr., 1996: 247-254). Consider the following 48 
observations on measured bond strength: 


11.5 12.1 9.9 9.3 7.8 6.2 6.6 7.0 13.4 17.1 9.3 5.6 
5.7 5.4 5.2 5.1 4.9 10.7 15.2 8.5 4.2 4.0 3.9 3.8 
3.6 3.4 20.6 25.5 13.8 12.6 13.1 8.9 8.2 10.7 14.2 7.6 
5.2 5.5 5.1 5.0 5.2 4.8 4.1 3.8 3.7 3.6 3.6 3.6 


Class               2-<4    4-<6    6-<8    8-<12   12-<20  20-<30 
Frequency           9       15      5       9       8       2 
Relative frequency  .1875   .3125   .1042   .1875   .1667   .0417 
Density             .094    .156    .052    .047    .021    .004 


The resulting histogram appears in Figure 1.10. The right or upper tail stretches 
out much farther than does the left or lower tail, a substantial departure from 
symmetry. 


Figure 1.10 A Minitab density histogram for the bond strength data of Example 1.11 


When class widths are unequal, not using a density scale will give a picture 
with distorted areas. For equal class widths, the divisor is the same in each 
density calculation, and the extra arithmetic simply results in a rescaling of the 
vertical axis (i.e., the histogram using relative frequency and the one using density 
will have exactly the same appearance). A density histogram does have one 
interesting property. Multiplying both sides of the formula for density by the class 
width gives 


relative frequency = (class width)(density) = (rectangle width)(rectangle height) 
                   = rectangle area 


That is, the area of each rectangle is the relative frequency of the corresponding 
class. Furthermore, since the sum of relative frequencies should be 1, the total area 
of all rectangles in a density histogram is 1. It is always possible to draw a histogram 
so that the area equals the relative frequency (this is true also for a histogram of 
discrete data): just use the density scale. This property will play an important role in 
motivating models for distributions in Chapter 4. 
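The density calculation and the area property can be verified numerically for the 
bond strength classes of Example 1.11. The Python sketch below is illustrative only; 
the variable names are ours. 

```python
# Class boundaries and frequencies from Example 1.11 (unequal class widths)
boundaries = [2, 4, 6, 8, 12, 20, 30]
freq = [9, 15, 5, 9, 8, 2]

n = sum(freq)
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
rel_freq = [f / n for f in freq]

# Rectangle height on the density scale = relative frequency / class width
density = [r / w for r, w in zip(rel_freq, widths)]

# Each rectangle's area recovers the class's relative frequency,
# so the areas must add up to 1
areas = [w * d for w, d in zip(widths, density)]
total_area = sum(areas)

for lo, hi, d in zip(boundaries, boundaries[1:], density):
    print(f"{lo}-<{hi}: density {d:.3f}")
print(f"total area = {total_area:.4f}")
```

Plotting relative frequency instead of density for these classes would make the wide 
classes on the right look far more prominent than their share of the data warrants. 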


Histogram Shapes 


Histograms come in a variety of shapes. A unimodal histogram is one that rises to 
a single peak and then declines. A bimodal histogram has two different peaks. 
Bimodality can occur when the data set consists of observations on two quite 
different kinds of individuals or objects. For example, consider a large data set 
consisting of driving times for automobiles traveling between San Luis Obispo, 
California, and Monterey, California (exclusive of stopping time for sightseeing, 
eating, etc.). This histogram would show two peaks: one for those cars that took the 
inland route (roughly 2.5 hours) and another for those cars traveling up the coast 
(3.5 to 4 hours). However, bimodality does not automatically follow in such situations. 
Only if the two separate histograms are “far apart” relative to their spreads 
will bimodality occur in the histogram of combined data. Thus a large data set 
consisting of heights of college students should not result in a bimodal histogram 
because the typical male height of about 69 inches is not far enough above the typi- 
cal female height of about 64-65 inches. A histogram with more than two peaks 
is said to be multimodal. Of course, the number of peaks may well depend on the 
choice of class intervals, particularly with a small number of observations. The 
larger the number of classes, the more likely it is that bimodality or multimodality 
will manifest itself. 
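The “far apart relative to their spreads” condition can be illustrated with an idealized 
calculation. The Python sketch below replaces each group's histogram with a normal 
curve, mixes two such curves in equal proportions, and counts the peaks of the 
combined curve on a fine grid. Everything beyond the typical values quoted above is an 
assumption made for illustration: the normal shape, the equal group sizes, a 
standard deviation of 3 inches for heights, a coastal driving time of 3.75 hours, and a 
standard deviation of 0.25 hour for driving times. For this idealized equal-mixture 
case, two peaks appear only when the centers are more than two standard deviations 
apart. 

```python
import math


def normal_pdf(x, mean, sd):
    """Height of the normal curve with the given mean and standard deviation at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def count_peaks(mean1, mean2, sd, points=2001):
    """Count local maxima of an equal mixture of two normal curves with common sd."""
    lo = min(mean1, mean2) - 4 * sd
    hi = max(mean1, mean2) + 4 * sd
    xs = [lo + (hi - lo) * k / (points - 1) for k in range(points)]
    dens = [0.5 * normal_pdf(x, mean1, sd) + 0.5 * normal_pdf(x, mean2, sd) for x in xs]
    # A grid point is a peak if it is strictly higher than both neighbors
    return sum(1 for k in range(1, points - 1) if dens[k - 1] < dens[k] > dens[k + 1])


# Heights: centers 64.5 and 69 inches, assumed sd of 3 inches
print(count_peaks(64.5, 69.0, 3.0))

# Driving times: centers 2.5 and 3.75 hours, assumed sd of 0.25 hour
print(count_peaks(2.5, 3.75, 0.25))
```

Under these assumptions the height mixture shows a single peak, whereas the driving-time 
mixture shows two. 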


EXAMPLE 1.12 Figure 1.11(a) shows a Minitab histogram of the weights (lb) of the 124 play- 
ers listed on the rosters of the San Francisco 49ers and the New England Patriots 
(teams the author would like to see meet in the Super Bowl) as of Nov. 20, 2009. 
Figure 1.11(b) is a smoothed histogram (actually what is called a density estimate) 
of the data from the R software package. Both the histogram and the smoothed his- 
togram show three distinct peaks; the one on the right is for linemen, the middle 
peak corresponds to linebacker weights, and the peak on the left is for all other 
players (wide receivers, quarterbacks, etc.). 


Figure 1.11 NFL player weights: (a) histogram (vertical axis: percent; horizontal axis: 
weight); (b) smoothed histogram (vertical axis: density estimate; horizontal axis: 
player weight) 

A histogram is symmetric if the left half is a mirror image of the right half. A 
unimodal histogram is positively skewed if the right or upper tail is stretched out 
compared with the left or lower tail and negatively skewed if the stretching is to 
the left. Figure 1.12 shows “smoothed” histograms, obtained by superimposing a 
smooth curve on the rectangles, that illustrate the various possibilities. 


Figure 1.12 Smoothed histograms: (a) symmetric unimodal; (b) bimodal; (c) positively skewed; 
and (d) negatively skewed 


Qualitative Data 


Both a frequency distribution and a histogram can be constructed when the data 
set is qualitative (categorical) in nature. In some cases, there will be a natural 
ordering of classes—for example, freshmen, sophomores, juniors, seniors, graduate 
students—whereas in other cases the order will be arbitrary—for example, Catholic, 
Jewish, Protestant, and the like. With such categorical data, the intervals above which 
rectangles are constructed should have equal width. 


EXAMPLE 1.13 The Public Policy Institute of California carried out a telephone survey of 2501 
California adult residents during April 2006 to ascertain how they felt about various 
aspects of K-12 public education. One question asked was “Overall, how would you 
rate the quality of public schools in your neighborhood today?” Table 1.2 displays 
the frequencies and relative frequencies, and Figure 1.13 shows the corresponding 
histogram (bar chart). 


Table 1.2 Frequency Distribution for the School Rating Data 


Rating        Frequency   Relative Frequency 
A             478         .191 
B             893         .357 
C             680         .272 
D             178         .071 
F             100         .040 
Don’t know    172         .069 
              2501        1.000 


Figure 1.13 Histogram of the school rating data from Minitab 


More than half the respondents gave an A or B rating, and only slightly more than 
10% gave a D or F rating. The percentages for parents of public school children were 
somewhat more favorable to schools: 24%, 40%, 24%, 6%, 4%, and 2%. 
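The relative frequencies in Table 1.2 and the two summary statements about them can be 
reproduced with a few lines of Python. This is an illustrative sketch with variable 
names of our own choosing. 

```python
# Rating frequencies from Table 1.2
counts = {"A": 478, "B": 893, "C": 680, "D": 178, "F": 100, "Don't know": 172}

n = sum(counts.values())
rel_freq = {rating: count / n for rating, count in counts.items()}

# Combined shares referred to in the discussion of the table
share_a_or_b = rel_freq["A"] + rel_freq["B"]
share_d_or_f = rel_freq["D"] + rel_freq["F"]

for rating, r in rel_freq.items():
    print(f"{rating:<10} {r:.3f}")
print(f"A or B: {share_a_or_b:.3f}   D or F: {share_d_or_f:.3f}")
```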


Multivariate Data 


Multivariate data is generally rather difficult to describe visually. Several methods for 
doing so appear later in the book, notably scatterplots for bivariate numerical data. 


EXERCISES Section 1.2 (10-32) 


10. Consider the strength data for beams given in Example 1.2. 

a. Construct a stem-and-leaf display of the data. What appears to be a representative 
strength value? Do the observations appear to be highly concentrated about the 
representative value or rather spread out? 

b. Does the display appear to be reasonably symmetric about a representative value, 
or would you describe its shape in some other way? 

c. Do there appear to be any outlying strength values? 

d. What proportion of strength observations in this sample exceed 10 MPa? 


11. The accompanying specific gravity values for various wood types used in 
construction appeared in the article “Bolted Connection Design Values Based on 
European Yield Model” (J. of Structural Engr., 1993: 2169-2186): 


.31 .35 .36 .36 .37 .38 .40 .40 .40 
.41 .41 .42 .42 .42 .42 .42 .43 .44 
.45 .46 .46 .47 .48 .48 .48 .51 .54 
.54 .55 .58 .62 .66 .66 .67 .68 .75 


Construct a stem-and-leaf display using repeated stems, and comment on any 
interesting features of the display. 


12. The accompanying summary data on CeO2 particle sizes (nm) under certain 
experimental conditions was read from a graph in the article “Nanoceria – Energetics 
of Surfaces, Interfaces and Water Adsorption” (J. of the Amer. Ceramic Soc., 2011: 
3992-3999): 


Class:      3.0-<3.5  3.5-<4.0  4.0-<4.5  4.5-<5.0  5.0-<5.5 
Frequency:  5         15        27        34        22 
Class:      5.5-<6.0  6.0-<6.5  6.5-<7.0  7.0-<7.5  7.5-<8.0 
Frequency:  14        7         2         4         1 


a. What proportion of the observations are less than 5? 

b. What proportion of the observations are at least 6? 

c. Construct a histogram with relative frequency on the vertical axis and comment 
on interesting features. In particular, does the distribution of particle sizes appear 
to be reasonably symmetric or somewhat skewed? [Note: The investigators fit a 
lognormal distribution to the data; this is discussed in Chapter 4.] 

d. Construct a histogram with density on the vertical axis and compare to the 
histogram in (c). 


13. Allowable mechanical properties for structural design of metallic aerospace 
vehicles requires an approved method for statistically analyzing empirical test data. 
The article “Establishing Mechanical Property Allowables for Metals” (J. of Testing 
and Evaluation, 1998: 293-299) used the accompanying data on tensile ultimate 
strength (ksi) as a basis for addressing the difficulties in developing such a method. 


122.2 124.2 124.3 125.6 126.3 126.5 126.5 127.2 127.3 
127.5 127.9 128.6 128.8 129.0 129.2 129.4 129.6 130.2 
130.4 130.8 131.3 131.4 131.4 131.5 131.6 131.6 131.8 
131.8 132.3 132.4 132.4 132.5 132.5 132.5 132.5 132.6 
132.7 132.9 133.0 133.1 133.1 133.1 133.1 133.2 133.2 
133.2 133.3 133.3 133.5 133.5 133.5 133.8 133.9 134.0 
134.0 134.0 134.0 134.1 134.2 134.3 134.4 134.4 134.6 
134.7 134.7 134.7 134.8 134.8 134.8 134.9 134.9 135.2 
135.2 135.2 135.3 135.3 135.4 135.5 135.5 135.6 135.6 
135.7 135.8 135.8 135.8 135.8 135.8 135.9 135.9 135.9 
135.9 136.0 136.0 136.1 136.2 136.2 136.3 136.4 136.4 
136.6 136.8 136.9 136.9 137.0 137.1 137.2 137.6 137.6 
137.8 137.8 137.8 137.9 137.9 138.2 138.2 138.3 138.3 
138.4 138.4 138.4 138.5 138.5 138.6 138.7 138.7 139.0 
139.1 139.5 139.6 139.8 139.8 140.0 140.0 140.7 140.7 
140.9 140.9 141.2 141.4 141.5 141.6 142.9 143.4 143.5 
143.6 143.8 143.8 143.9 144.1 144.5 144.5 147.7 147.7 


a. Construct a stem-and-leaf display of the data by first deleting (truncating) the 
tenths digit and then repeating each stem value five times (once for leaves 1 and 2, 
a second time for leaves 3 and 4, etc.). Why is it relatively easy to identify a 
representative strength value? 

b. Construct a histogram using equal-width classes with the first class having a 
lower limit of 122 and an upper limit of 124. Then comment on any interesting 
features of the histogram. 


14. The accompanying data set consists of observations on shower-flow rate (L/min) 
for a sample of n = 129 houses in Perth, Australia (“An Application of Bayes 
Methodology to the Analysis of Diary Records in a Water Use Study,” J. Amer. Stat. 
Assoc., 1987: 705-711): 


4.6 12.3 7.1 7.0 4.0 9.2 6.7 6.9 11.5 5.1 
11.2 10.5 14.3 8.0 8.8 6.4 5.1 5.6 9.6 7.5 
7.5 6.2 5.8 2.3 3.4 10.4 9.8 6.6 3.7 6.4 
8.3 6.5 7.6 9.3 9.2 7.3 5.0 6.3 13.8 6.2 
5.4 4.8 7.5 6.0 6.9 10.8 7.5 6.6 5.0 3.3 
7.6 3.9 11.9 2.2 15.0 7.2 6.1 15.3 18.9 7.2 
5.4 5.5 4.3 9.0 12.7 11.3 7.4 5.0 3.5 8.2 
8.4 7.3 10.3 11.9 6.0 5.6 9.5 9.3 10.4 9.7 
5.1 6.7 10.2 6.2 8.4 7.0 4.8 5.6 10.5 14.6 
10.8 5.5 7.5 6.4 3.4 5.5 6.6 5.9 15.0 9.6 
7.8 7.0 6.9 4.1 3.6 11.9 3.7 5.7 6.8 11.3 
9.3 9.6 10.4 9.3 6.9 9.8 9.1 10.6 4.5 6.2 
8.3 3.2 4.9 5.0 6.0 8.2 6.3 3.8 6.0 


a. Construct a stem-and-leaf display of the data. 

b. What is a typical, or representative, flow rate? 

c. Does the display appear to be highly concentrated or spread out? 

d. Does the distribution of values appear to be reasonably symmetric? If not, how 
would you describe the departure from symmetry? 

e. Would you describe any observation as being far from the rest of the data (an 
outlier)? 


15. Do running times of American movies differ somehow from running times of 
French movies? The author investigated this question by randomly selecting 25 
recent movies of each type, resulting in the following running times: 


Am: 94 90 95 93 128 95 125 91 104 116 162 102 90 
    110 92 113 116 90 97 103 95 120 109 91 138 

Fr: 123 116 90 158 122 119 125 90 96 94 137 102 
    105 106 95 125 122 103 96 111 81 113 128 93 92 


Construct a comparative stem-and-leaf display by listing stems in the middle of your 
paper and then placing the Am leaves out to the left and the Fr leaves out to the 
right. Then comment on interesting features of the display. 


16. The article cited in Example 1.2 also gave the accompanying strength 
observations for cylinders: 


6.1 5.8 7.8 7.1 7.2 9.2 6.6 8.3 7.0 8.3 
7.8 8.1 7.4 8.5 8.9 9.8 9.7 14.1 12.6 11.2 


a. Construct a comparative stem-and-leaf display (see the previous exercise) of the 
beam and cylinder data, and then answer the questions in parts (b)-(d) of Exercise 10 
for the observations on cylinders. 

b. In what ways are the two sides of the display similar? Are there any obvious 
differences between the beam observations and the cylinder observations? 

c. Construct a dotplot of the cylinder data. 


17. The accompanying data came from a study of collusion in bidding within the 
construction industry (“Detection of Collusive Behavior,” J. of Construction Engr. 
and Mgmnt, 2012: 1251-1258). 


No. Bidders   No. Contracts 
2             7 
3             20 
4             26 
5             16 
6             11 
7             9 
8             6 
9             8 
10            3 
11            2 


a. What proportion of the contracts involved at most five bidders? At least five 
bidders? 

b. What proportion of the contracts involved between five and 10 bidders, inclusive? 
Strictly between five and 10 bidders? 

c. Construct a histogram and comment on interesting features. 


18. Every corporation has a governing board of directors. The number of individuals 
on a board varies from one corporation to another. One of the authors of the article 
“Does Optimal Corporate Board Size Exist? An Empirical Analysis” (J. of Applied 
Finance, 2010: 57-69) provided the accompanying data on the number of directors 
on each board in a random sample of 204 corporations. 


No. directors:  4   5   6   7   8   9 
Frequency:      3   12  13  25  24  42 
No. directors:  10  11  12  13  14  15 
Frequency:      23  19  16  11  5   4 
No. directors:  16  17  21  24  32 
Frequency:      1   3   1   1   1 


a. Construct a histogram of the data based on relative frequencies and comment on 
any interesting features. 

b. Construct a frequency distribution in which the last row includes all boards with 
at least 18 directors. If this distribution had appeared in the cited article, would you 
be able to draw a histogram? Explain. 

c. What proportion of these corporations have at most 10 directors? 

d. What proportion of these corporations have more than 15 directors? 


19. The number of contaminating particles on a silicon wafer prior to a certain 
rinsing process was determined for each wafer in a sample of size 100, resulting in 
the following frequencies: 


Number of particles  0   1   2   3   4   5   6   7 
Frequency            1   2   3   12  11  15  18  10 
Number of particles  8   9   10  11  12  13  14 
Frequency            12  4   5   3   1   2   1 


a. What proportion of the sampled wafers had at least one particle? At least five 
particles? 

b. What proportion of the sampled wafers had between five and ten particles, 
inclusive? Strictly between five and ten particles? 

c. Draw a histogram using relative frequency on the vertical axis. How would you 
describe the shape of the histogram? 


20. The article “Determination of Most Representative Subdivision” (J. of Energy 
Engr., 1993: 43-55) gave data on various characteristics of subdivisions that could 
be used in deciding whether to provide electrical power using overhead lines or 
underground lines. Here are the values of the variable x = total length of streets 
within a subdivision: 


1280 5320 4390 2100 1240 3060 4770 
1050 360 3330 3380 340 1000 960 
1320 530 3350 540 3870 1250 2400 
960 1120 2120 450 2250 2320 2400 
3150 5700 5220 500 1850 2460 5850 
2700 2730 1670 100 5770 3150 1890 
510 240 396 1419 2109 


a. Construct a stem-and-leaf display using the thousands digit as the stem and the 
hundreds digit as the leaf, and comment on the various features of the display. 

b. Construct a histogram using class boundaries 0, 1000, 2000, 3000, 4000, 5000, 
and 6000. What proportion of subdivisions have total length less than 2000? 
Between 2000 and 4000? How would you describe the shape of the histogram? 


21. The article cited in Exercise 20 also gave the following 
values of the variables y = number of culs-de-sac and 
z = number of intersections: 


0O020111210011 
53004400121 
11201221102 
01324660118 
0 
3 


Or OF Fe 


110 

00 0 

a. Construct a histogram for the y data. What propor- 
tion of these subdivisions had no culs-de-sac? At 
least one cul-de-sac? 

b. Construct a histogram for the z data. What propor- 
tion of these subdivisions had at most five intersec- 
tions? Fewer than five intersections? 


22. How does the speed of a runner vary over the course of a marathon (a distance of 
42.195 km)? Consider determining both the time to run the first 5 km and the time to 
run between the 35-km and 40-km points, and then subtracting the former time from 
the latter time. A positive value of this difference corresponds to a runner slowing 
down toward the end of the race. The accompanying histogram is based on times of 
runners who participated in several different Japanese marathons (“Factors Affecting 
Runners’ Marathon Performance,” Chance, Fall, 1993: 24-30). 


What are some interesting features of this histogram? What is a typical difference 
value? Roughly what proportion of the runners ran the late distance more quickly 
than the early distance? 


Histogram for Exercise 22 


23. The article “Statistical Modeling of the Time Course of Tantrum Anger” (Annals 
of Applied Stats, 2009: 1013-1034) discussed how anger intensity in children’s 
tantrums could be related to tantrum duration as well as behavioral indicators such 
as shouting, stamping, and pushing or pulling. The following frequency distribution 
was given (and also the corresponding histogram): 


0-<2: 136     2-<4: 92     4-<11: 71 
11-<20: 26    20-<30: 7    30-<40: 3 


Draw the histogram and then comment on any interesting features. 


24. The accompanying data set consists of observations on shear strength (lb) of 
ultrasonic spot welds made on a certain type of alclad sheet. Construct a relative 
frequency histogram based on ten equal-width classes with boundaries 4000, 4200, .... 
[The histogram will agree with the one in “Comparison of Properties of Joints 
Prepared by Ultrasonic Welding and Other Means” (J. of Aircraft, 1983: 552-556).] 
Comment on its features. 


5434 4948 4521 4570 4990 5702 5241 
5112 5015 4659 4806 4637 5670 4381 
4820 5043 4886 4599 5288 5299 4848 
5378 5260 5055 5828 5218 4859 4780 
5027 5008 4609 4772 5133 5095 4618 
4848 5089 5518 5333 5164 5342 5069 
4755 4925 5001 4803 4951 5679 5256 
5207 5621 4918 5138 4786 4500 5461 
5049 4974 4592 4173 5296 4965 5170 
4740 5173 4568 5653 5078 4900 4968 
5248 5245 4723 5275 5419 5205 4452 
5227 5555 5388 5498 4681 5076 4774 
4931 4493 5309 5582 4308 4823 4417 
5364 5640 5069 5188 5764 5273 5042 
5189 4986 


25. A transformation of data values by means of some mathematical function, such 
as √x or 1/x, can often yield a set of numbers that has “nicer” statistical properties 
than the original data. In particular, it may be possible to find a function for which 
the histogram of transformed values is more symmetric (or, even better, more like a 
bell-shaped curve) than the original data. As an example, the article “Time Lapse 
Cinematographic Analysis of Beryllium-Lung Fibroblast Interactions” (Environ. 
Research, 1983: 34-43) reported the results of experiments designed to study the 
behavior of certain individual cells that had been exposed to beryllium. An important 
characteristic of such an individual cell is its interdivision time (IDT). IDTs were 
determined for a large number of cells, both in exposed (treatment) and unexposed 
(control) conditions. The authors of the article used a logarithmic transformation, 
that is, transformed value = log(original value). Consider the following 
representative IDT data: 


IDT    log10(IDT)    IDT    log10(IDT)    IDT    log10(IDT) 
28.1   1.45          60.1   1.78          21.0   1.32 
31.2   1.49          23.7   1.37          22.3   1.35 
13.7   1.14          18.6   1.27          15.5   1.19 
46.0   1.66          21.4   1.33          36.3   1.56 
25.8   1.41          26.6   1.42          19.1   1.28 
16.8   1.23          26.2   1.42          38.4   1.58 
34.8   1.54          32.0   1.51          72.8   1.86 
62.3   1.79          43.5   1.64          48.9   1.69 
28.0   1.45          17.4   1.24          21.4   1.33 
17.9   1.25          38.8   1.59          20.7   1.32 
19.5   1.29          30.6   1.49          57.3   1.76 
21.1   1.32          55.6   1.75          40.9   1.61 
31.9   1.50          25.5   1.41 
28.9   1.46          52.1   1.72 


Use class intervals 10-<20, 20-<30, ... to construct a histogram of the original data. 
Use intervals 1.1-<1.2, 1.2-<1.3, ... to do the same for the transformed data. What 
is the effect of the transformation? 


26. Automated electron backscattered diffraction is now being used in the study of 
fracture phenomena. The following information on misorientation angle (degrees) 
was extracted from the article “Observations on the Faceted Initiation Site in the 
Dwell-Fatigue Tested Ti-6242 Alloy: Crystallographic Orientation and Size Effects” 
(Metallurgical and Materials Trans., 2006: 1507-1518). 


Class:      0-<5    5-<10   10-<15  15-<20 
Rel freq.:  .177    .166    .175    .136 
Class:      20-<30  30-<40  40-<60  60-<90 
Rel freq.:  .194    .078    .044    .030 


a. Is it true that more than 50% of the sampled angles are smaller than 15°, as 
asserted in the paper? 

b. What proportion of the sampled angles are at least 30°? 

c. Roughly what proportion of angles are between 10° and 25°? 

d. Construct a histogram and comment on any interesting features. 


27. The article “Study on the Life Distribution of Microdrills” (J. of Engr. 
Manufacture, 2002: 301-305) reported the following observations, listed in 
increasing order, on drill lifetime (number of holes that a drill machines before it 
breaks) when holes were drilled in a certain brass alloy. 


11 14 20 23 31 36 39 44 47 50 
59 61 65 67 68 71 74 76 78 79 
81 84 85 89 91 93 96 99 101 104 
105 105 112 118 123 136 139 141 148 158 
161 168 184 206 248 263 289 322 388 513 


a. Why can a frequency distribution not be based on the class intervals 0-50, 50-100, 
100-150, and so on? 

b. Construct a frequency distribution and histogram of the data using class 
boundaries 0, 50, 100, ..., and then comment on interesting characteristics. 

c. Construct a frequency distribution and histogram of the natural logarithms of the 
lifetime observations, and comment on interesting characteristics. 

d. What proportion of the lifetime observations in this sample are less than 100? 
What proportion of the observations are at least 200? 


28. The accompanying frequency distribution on deposited energy (mJ) was extracted 
from the article “Experimental Analysis of Laser-Induced Spark Ignition of Lean 
Turbulent Premixed Flames” (Combustion and Flame, 2013: 1414-1427). 


1.0-<2.0   5      2.0-<2.4   11 
2.4-<2.6   13     2.6-<2.8   30 
2.8-<3.0   46     3.0-<3.2   66 
3.2-<3.4   133    3.4-<3.6   141 
3.6-<3.8   126    3.8-<4.0   92 
4.0-<4.2   73     4.2-<4.4   38 
4.4-<4.6   19     4.6-<5.0   11 


a. What proportion of these ignition trials resulted in a deposited energy of less than 
3 mJ? 

b. What proportion of these ignition trials resulted in a deposited energy of at least 
4 mJ? 

c. Roughly what proportion of the trials resulted in a deposited energy of at least 
3.5 mJ? 

d. Construct a histogram and comment on its shape. 


29. The following categories for type of physical activity involved when an 
industrial accident occurred appeared in the article “Finding Occupational Accident 
Patterns in the Extractive Industry Using a Systematic Data Mining Approach” 
(Reliability Engr. and System Safety, 2012: 108-122): 

A. Working with handheld tools 
B. Movement 
C. Carrying by hand 
D. Handling of objects 
E. Operating a machine 
F. Other 


Construct a frequency distribution, including relative frequencies, and histogram for 
the accompanying data from 100 accidents (the percentages agree with those in the 
cited article): 


A BODAA FCACBEBAC 
F DBC§#éODAA CBEBCEA 
BAAABCCODFODBBAE 
Cc BACBEEDABCEAA 
FC BDODODBODCA F A AB 
DEA EOD BCA FA CODODA 
ABA FODC§G§ACBFODAEA 
cD 


30. A Pareto diagram is a variation of a histogram for categorical data resulting from 
a quality control study. Each category represents a different type of product 
nonconformity or production problem. The categories are ordered so that the one 
with the largest frequency appears on the far left, then the category with the second 
largest frequency, and so on. Suppose the following information on nonconformities 
in circuit packs is obtained: failed component, 126; incorrect component, 210; 
insufficient solder, 67; excess solder, 54; missing component, 131. Construct a 
Pareto diagram. 


31. The cumulative frequency and cumulative relative frequency for a particular class 
interval are the sum of frequencies and relative frequencies, respectively, for that 
interval and all intervals lying below it. If, for example, there are four intervals with 
frequencies 9, 16, 13, and 12, then the cumulative frequencies are 9, 25, 38, and 50, 
and the cumulative relative frequencies are .18, .50, .76, and 1.00. Compute the 
cumulative frequencies and cumulative relative frequencies for the data of 
Exercise 24. 


32. Fire load (MJ/m²) is the heat energy that could be released per square meter of 
floor area by combustion of contents and the structure itself. The article “Fire Loads 
in Office Buildings” (J. of Structural Engr., 1997: 365-368) gave the following 
cumulative percentages (read from a graph) for fire loads in a sample of 388 rooms: 


Value          0      150    300    450    600 
Cumulative %   0      19.3   37.6   62.7   77.5 
Value          750    900    1050   1200   1350 
Cumulative %   87.2   93.8   95.7   98.6   99.1 
Value          1500   1650   1800   1950 
Cumulative %   99.5   99.6   99.8   100.0 


a. Construct a relative frequency histogram and comment on interesting features. 

b. What proportion of fire loads are less than 600? At least 1200? 

c. What proportion of the loads are between 600 and 1200? 
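The small illustration in the statement of Exercise 31 (frequencies 9, 16, 13, and 12) 
can be reproduced with a running total. The Python sketch below covers only that 
four-interval illustration; the variable names are ours. 

```python
from itertools import accumulate

# Frequencies for the four intervals in the illustration of Exercise 31
freq = [9, 16, 13, 12]

# Running totals give the cumulative frequencies
cum_freq = list(accumulate(freq))

# Dividing by the total number of observations gives cumulative relative frequencies
n = cum_freq[-1]
cum_rel_freq = [c / n for c in cum_freq]

print(cum_freq)
print([round(c, 2) for c in cum_rel_freq])
```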


1.3 Measures of Location 


Visual summaries of data are excellent tools for obtaining preliminary impres- 
sions and insights. More formal data analysis often requires the calculation and 
interpretation of numerical summary measures. That is, from the data we try to 
extract several summarizing quantities that might serve to characterize the data 
set and convey some of its prominent features. Our primary concern will be with 
numerical data; some comments regarding categorical data appear at the end of 
the section. 

Suppose, then, that our data set is of the form $x_1, x_2, \ldots, x_n$, where each $x_i$ is 
a number. What features of such a set of numbers are of most interest and deserve 
emphasis? One important characteristic of a set of numbers is its location, and in 
particular its center. This section presents methods for describing the location of a 
data set; in Section 1.4 we will turn to methods for assessing variability in a set of 
numbers. 


The Mean 


For a given set of numbers $x_1, x_2, \ldots, x_n$, the most familiar and useful measure of the
center is the mean, or arithmetic average of the set. Because we will almost always
think of the $x_i$'s as constituting a sample, we will often refer to the arithmetic average
as the sample mean and denote it by $\bar{x}$.

DEFINITION The sample mean $\bar{x}$ of observations $x_1, x_2, \ldots, x_n$ is given by

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$

The numerator of $\bar{x}$ can be written more informally as $\sum x_i$, where the summation
is over all sample observations.


For reporting $\bar{x}$, we recommend using decimal accuracy of one digit more than the
accuracy of the $x_i$'s. Thus if observations are stopping distances with $x_1 = 125$,
$x_2 = 131$, and so on, we might have $\bar{x} = 127.3$ ft.


EXAMPLE 1.14 Recent years have seen growing commercial interest in the use of what is known 
as internally cured concrete. This concrete contains porous inclusions most com- 
monly in the form of lightweight aggregate (LWA). The article “Characterizing 
Lightweight Aggregate Desorption at High Relative Humidities Using a 
Pressure Plate Apparatus” (J. of Materials in Civil Engr, 2012: 961-969) reported 
on a study in which researchers examined various physical properties of 14 LWA 
specimens. Here are the 24-hour water-absorption percentages for the specimens: 


$x_1 = 16.0$    $x_2 = 30.5$    $x_3 = 17.7$    $x_4 = 17.5$    $x_5 = 14.1$
$x_6 = 10.0$    $x_7 = 15.6$    $x_8 = 15.0$    $x_9 = 19.1$    $x_{10} = 17.9$
$x_{11} = 18.9$   $x_{12} = 18.5$   $x_{13} = 12.2$   $x_{14} = 6.0$


Figure 1.14 shows a dotplot of the data; a water-absorption percentage in the mid-teens
appears to be “typical.” With $\sum x_i = 229.0$, the sample mean is

$$\bar{x} = \frac{229.0}{14} = 16.36$$
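
The arithmetic in this example is easy to check by machine. What follows is a minimal Python sketch rather than part of the original example; the variable names are our own, and the rounding follows the recommendation of reporting one digit more than the data carry.

```python
# Water-absorption percentages for the 14 LWA specimens of Example 1.14.
absorption = [16.0, 30.5, 17.7, 17.5, 14.1,
              10.0, 15.6, 15.0, 19.1, 17.9,
              18.9, 18.5, 12.2, 6.0]

n = len(absorption)          # sample size
total = sum(absorption)      # sum of the x_i
sample_mean = total / n      # x-bar = (sum of x_i) / n

# The data have one decimal place, so report the mean to two.
print(f"n = {n}, sum = {total:.1f}, mean = {sample_mean:.2f}")
```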


A physical interpretation of the sample mean demonstrates how it assesses the center 
of a sample. Think of each dot in the dotplot as representing a 1-lb weight. Then a 
fulcrum placed with its tip on the horizontal axis will balance precisely when it is 
located at $\bar{x}$. So the sample mean can be regarded as the balance point of the distribution
of observations.


Figure 1.14 Dotplot of the data from Example 1.14 (axis from 10 to 40, with $\bar{x} = 16.36$ marked)


Just as $\bar{x}$ represents the average value of the observations in a sample, the
average of all values in the population can be calculated. This average is called the
population mean and is denoted by the Greek letter $\mu$. When there are $N$ values in
the population (a finite population), then $\mu$ = (sum of the $N$ population values)/$N$.
We will give a more general definition for $\mu$ in Chapters 3 and 4 that applies to
both finite and (conceptually) infinite populations. Just as $\bar{x}$ is an interesting and
important measure of sample location, $\mu$ is an interesting and important (often
the most important) characteristic of a population. One of our first tasks in statistical
inference will be to present methods based on the sample mean for drawing
conclusions about a population mean. For example, we might use the sample mean
$\bar{x} = 16.36$ computed in Example 1.14 as a point estimate (a single number that is
our “best” guess) of $\mu$ = the true average water-absorption percentage for all specimens
treated as described.

The mean suffers from one deficiency that makes it an inappropriate measure 
of center under some circumstances: Its value can be greatly affected by the pres- 
ence of even a single outlier (unusually large or small observation). For example, 
if a sample of employees contains nine who earn $50,000 per year and one whose 
yearly salary is $150,000, the sample mean salary is $60,000; this value certainly 
does not seem representative of the data. In such situations, it is desirable to employ 
a measure that is less sensitive to outlying values than $\bar{x}$, and we will momentarily
propose one. However, although $\bar{x}$ does have this potential defect, it is still the
most widely used measure, largely because there are many populations for which 
an extreme outlier in the sample would be highly unlikely. When sampling from 
such a population (a normal or bell-shaped population being the most important 
example), the sample mean will tend to be stable and quite representative of the 
sample. 


The Median 


The word median is synonymous with “middle,” and the sample median is indeed the 
middle value once the observations are ordered from smallest to largest. When the 
observations are denoted by $x_1, \ldots, x_n$, we will use the symbol $\tilde{x}$ to represent the
sample median. 


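DEFINITION The sample median is obtained by first ordering the $n$ observations from
smallest to largest (with any repeated values included so that every sample
observation appears in the ordered list). Then,

$$\tilde{x} = \begin{cases}
\text{the single middle value if } n \text{ is odd} = \left(\dfrac{n+1}{2}\right)^{\text{th}} \text{ ordered value} \\[6pt]
\text{the average of the two middle values if } n \text{ is even} = \text{average of } \left(\dfrac{n}{2}\right)^{\text{th}} \text{ and } \left(\dfrac{n}{2}+1\right)^{\text{th}} \text{ ordered values}
\end{cases}$$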


EXAMPLE 1.15 People not familiar with classical music might tend to believe that a composer’s
instructions for playing a particular piece are so specific that the duration would 
not depend at all on the performer(s). However, there is typically plenty of room 
for interpretation, and orchestral conductors and musicians take full advantage 
of this. The author went to the Web site ArkivMusic.com and selected a sample 
of 12 recordings of Beethoven’s Symphony No. 9 (the “Choral,” a stunningly 
beautiful work), yielding the following durations (min) listed in increasing 
order: 


62.3 62.8 63.6 65.2 65.7 66.4 67.4 68.4 68.8 70.8 75.7 79.0 




Here is a dotplot of the data: 


Figure 1.15 Dotplot of the data from Example 1.15 (horizontal axis: Duration, 60 to 80)


Since $n = 12$ is even, the sample median is the average of the $n/2 = 6$th and
$(n/2 + 1) = 7$th values from the ordered list:

$$\tilde{x} = \frac{66.4 + 67.4}{2} = 66.90$$


Note that if the largest observation 79.0 had not been included in the sample, the resulting
sample median for the $n = 11$ remaining observations would have been the single
middle value 66.4 (the $[n + 1]/2 = 6$th ordered value, i.e., the 6th value in from either
end of the ordered list). The sample mean is $\bar{x} = \sum x_i / n = 816.1/12 = 68.01$, roughly a
minute larger than the median. The mean is pulled out relative to the median because
the sample “stretches out” somewhat more on the upper end than on the lower end.
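
The two cases of the definition can be written out directly. The function below is a sketch of our own (the name `sample_median` is not from the text) that follows the ordered-value rule literally, applied to the twelve durations and then to the eleven that remain when 79.0 is dropped.

```python
def sample_median(values):
    """Median by the definition: order the data, then take the single
    middle value (n odd) or average the two middle values (n even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # ((n + 1) / 2)th ordered value; subtract 1 for 0-based indexing
        return ordered[(n + 1) // 2 - 1]
    # average of the (n/2)th and (n/2 + 1)th ordered values
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


durations = [62.3, 62.8, 63.6, 65.2, 65.7, 66.4,
             67.4, 68.4, 68.8, 70.8, 75.7, 79.0]

median_all = sample_median(durations)               # n = 12, even case
median_without_max = sample_median(durations[:-1])  # n = 11, odd case
mean_all = sum(durations) / len(durations)

print(f"median (n=12) = {median_all:.2f}")
print(f"median (n=11) = {median_without_max:.1f}")
print(f"mean   (n=12) = {mean_all:.2f}")
```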


The data in Example 1.15 illustrates an important property of $\tilde{x}$ in contrast to
$\bar{x}$: The sample median is very insensitive to outliers. If, for example, the two largest
$x_i$'s are increased from 75.7 and 79.0 to 85.7 and 89.0, respectively, $\tilde{x}$ would be unaffected.
Thus, in the treatment of outlying data values, $\bar{x}$ and $\tilde{x}$ are at opposite ends
of a spectrum. Both quantities describe where the data is centered, but they will not
in general be equal because they focus on different aspects of the sample.
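
This contrast can be seen numerically. The sketch below uses Python's standard `statistics` module; the list name `inflated` is our own label for the modified sample described above.

```python
import statistics

durations = [62.3, 62.8, 63.6, 65.2, 65.7, 66.4,
             67.4, 68.4, 68.8, 70.8, 75.7, 79.0]

# Replace the two largest durations (75.7 and 79.0) with 85.7 and 89.0.
inflated = durations[:-2] + [85.7, 89.0]

median_before = statistics.median(durations)
median_after = statistics.median(inflated)
mean_before = statistics.mean(durations)
mean_after = statistics.mean(inflated)

print(f"median: {median_before:.2f} -> {median_after:.2f}")  # unchanged
print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")      # pulled upward
```

Each of the two altered values moved by 10 minutes, so the mean shifts by 20/12, about 1.67 minutes, while the two middle ordered values (and hence the median) are untouched.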

Analogous to $\tilde{x}$ as the middle value in the sample is a middle value in the
population, the population median, denoted by $\tilde{\mu}$. As with $\bar{x}$ and $\mu$, we can think
of using the sample median $\tilde{x}$ to make an inference about $\tilde{\mu}$. In Example 1.15, we
might use $\tilde{x} = 66.90$ as an estimate of the median time for the population of all
recordings.

The population mean $\mu$ and median $\tilde{\mu}$ will not generally be identical.
If the population distribution is positively or negatively skewed, as pictured in
Figure 1.16, then $\mu \neq \tilde{\mu}$. When this is the case, in making inferences we must
first decide which of the two population characteristics is of greater interest and 
then proceed accordingly. 


Figure 1.16 Three different shapes for a population distribution: (a) negative skew, with $\mu < \tilde{\mu}$; (b) symmetric, with $\mu = \tilde{\mu}$; (c) positive skew, with $\tilde{\mu} < \mu$


Other Measures of Location: Quartiles, 
Percentiles, and Trimmed Means 


The median (population or sample) divides the data set into two parts of equal 
size. To obtain finer measures of location, we could divide the data into more 
than two such parts. Roughly speaking, quartiles divide the data set into four 
equal parts, with the observations above the third quartile constituting the upper 
quarter of the data set, the second quartile being identical to the median, and 
the first quartile separating the lower quarter from the upper three-quarters.
Similarly, a data set (sample or population) can be even more finely divided using
percentiles; the 99th percentile separates the highest 1% from the bottom 99%, 
and so on. Unless the number of observations is a multiple of 100, care must be 
exercised in obtaining percentiles. We will revisit percentiles in Chapter 4 in con- 
nection with certain models for infinite populations. 

The mean is quite sensitive to a single outlier, whereas the median is
impervious to many outliers. Since extreme behavior of either type might be
undesirable, we briefly consider alternative measures that are neither as sensitive
as $\bar{x}$ nor as insensitive as $\tilde{x}$. To motivate these alternatives, note that $\bar{x}$ and
$\tilde{x}$ are at opposite extremes of the same “family” of measures. The mean is the
average of all the data, whereas the median results from eliminating all but the
middle one or two values and then averaging. To paraphrase, the mean involves
trimming 0% from each end of the sample, whereas for the median the maximum
possible amount is trimmed from each end. A trimmed mean is a compromise
between $\bar{x}$ and $\tilde{x}$. A 10% trimmed mean, for example, would be computed by
eliminating the smallest 10% and the largest 10% of the sample and then averaging
what remains.


EXAMPLE 1.16 The production of Bidri is a traditional craft of India. Bidri wares (bowls, vessels, 
and so on) are cast from an alloy containing primarily zinc along with some copper. 
Consider the following observations on copper content (%) for a sample of Bidri 
artifacts in London’s Victoria and Albert Museum (“Enigmas of Bidri,” Surface
Engr., 2005: 333-339), listed in increasing order:


2.0  2.4  2.5  2.6  2.6  2.7  2.7  2.8  3.0  3.1  3.2  3.3  3.3
3.4  3.4  3.6  3.6  3.6  3.6  3.7  4.4  4.6  4.7  4.8  5.3  10.1


Figure 1.17 is a dotplot of the data. A prominent feature is the single outlier at the 
upper end; the distribution is somewhat sparser in the region of larger values than 
is the case for smaller values. The sample mean and median are 3.65 and 3.35, 
respectively. A trimmed mean with a trimming percentage of 100(2/26) = 7.7% 
results from eliminating the two smallest and two largest observations; this gives 
$\bar{x}_{tr(7.7)} = 3.42$. Trimming here eliminates the larger outlier and so pulls the trimmed
mean toward the median. 
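
The three summaries quoted in this example can be reproduced with one small helper. This is a sketch under our own naming (`trimmed_mean_k` is not from the text): it drops the k smallest and k largest observations and averages the rest, so k = 0 gives the mean and the largest possible k gives the median.

```python
def trimmed_mean_k(values, k):
    """Average of what remains after deleting the k smallest and the
    k largest observations (requires 2k < number of observations)."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


copper = [2.0, 2.4, 2.5, 2.6, 2.6, 2.7, 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.3,
          3.4, 3.4, 3.6, 3.6, 3.6, 3.6, 3.7, 4.4, 4.6, 4.7, 4.8, 5.3, 10.1]
n = len(copper)

mean = trimmed_mean_k(copper, 0)               # 0% trimmed: the mean
median = trimmed_mean_k(copper, (n - 1) // 2)  # maximal trimming: the median
trimmed = trimmed_mean_k(copper, 2)            # two observations off each end
trim_pct = 100 * 2 / n                         # trimming percentage

print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"{trim_pct:.1f}% trimmed mean = {trimmed:.2f}")
```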


Figure 1.17 Dotplot of copper contents from Example 1.16 (axis from 1 to 11, with $\bar{x}$, $\bar{x}_{tr(7.7)}$, and $\tilde{x}$ marked)


A trimmed mean with a moderate trimming percentage—someplace between 
5% and 25%—will yield a measure of center that is neither as sensitive to outliers 
as is the mean nor as insensitive as the median. If the desired trimming percentage 
is $100\alpha\%$ and $n\alpha$ is not an integer, the trimmed mean must be calculated by interpolation.
For example, consider $\alpha = .10$ for a 10% trimming percentage and $n = 26$
as in Example 1.16. Then $\bar{x}_{tr(10)}$ would be the appropriate weighted average of the
7.7% trimmed mean calculated there and the 11.5% trimmed mean resulting from 
trimming three observations from each end. 
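
The text does not spell out the weights. One natural reading, assumed here, is linear interpolation in the trimming percentage: with $n\alpha = 2.6$, the 10% target sits six-tenths of the way from the 7.7% trimmed mean to the 11.5% trimmed mean. The sketch below implements that assumption; both function names are our own.

```python
def trimmed_mean_k(values, k):
    """Average after deleting the k smallest and k largest observations."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def trimmed_mean(values, alpha):
    """100*alpha% trimmed mean. When n*alpha is not an integer, interpolate
    linearly between the two neighboring whole-observation trimmed means
    (an assumed interpretation of "appropriate weighted average").
    Assumes alpha is small enough that some observations remain."""
    n = len(values)
    target = n * alpha             # observations to trim from each end
    k_low = int(target)            # whole number of observations below target
    if k_low == target:
        return trimmed_mean_k(values, k_low)
    weight_high = target - k_low   # fractional distance toward k_low + 1
    return ((1 - weight_high) * trimmed_mean_k(values, k_low)
            + weight_high * trimmed_mean_k(values, k_low + 1))


copper = [2.0, 2.4, 2.5, 2.6, 2.6, 2.7, 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.3,
          3.4, 3.4, 3.6, 3.6, 3.6, 3.6, 3.7, 4.4, 4.6, 4.7, 4.8, 5.3, 10.1]

x_tr_11_5 = trimmed_mean_k(copper, 3)   # three trimmed from each end
x_tr_10 = trimmed_mean(copper, 0.10)    # 0.4 * (7.7% mean) + 0.6 * (11.5% mean)

print(f"11.5% trimmed mean = {x_tr_11_5:.3f}")
print(f"10% trimmed mean   = {x_tr_10:.3f}")
```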


Categorical Data and Sample Proportions


When the data is categorical, a frequency distribution or relative frequency dis- 
tribution provides an effective tabular summary of the data. The natural numeri- 
cal summary quantities in this situation are the individual frequencies and the 
relative frequencies. For example, if a survey of individuals who own digital 
cameras is undertaken to study brand preference, then each individual in the 
sample would identify the brand of camera that he or she owned, from which we 
could count the number owning Canon, Sony, Kodak, and so on. Consider sam- 
pling a dichotomous population—one that consists of only two categories (such 
as voted or did not vote in the last election, does or does not own a digital cam- 
era, etc.). If we let x denote the number in the sample falling in category 1, then 
the number in category 2 is n — x. The relative frequency or sample proportion in 
category 1 is $x/n$ and the sample proportion in category 2 is $1 - x/n$. Let’s denote
a response that falls in category 1 by a 1 and a response that falls in category 2 by
a 0. A sample size of $n = 10$ might then yield the responses 1, 1, 0, 1, 1, 1, 0, 0,
1, 1. The sample mean for this numerical sample is (since number of 1s = $x$ = 7)

$$\frac{x_1 + \cdots + x_n}{n} = \frac{1 + 1 + 0 + \cdots + 1 + 1}{10} = \frac{7}{10} = \frac{x}{n} = \text{sample proportion}$$

More generally, focus attention on a particular category and code the sample 
results so that a 1 is recorded for an observation in the category and a 0 for an observation
not in the category. Then the sample proportion of observations in the category
is the sample mean of the sequence of 1’s and 0’s. Thus a sample mean can be used to
summarize the results of a categorical sample. These remarks also apply to situations 
in which categories are defined by grouping values in a numerical sample or population 
(e.g., we might be interested in knowing whether individuals have owned their present 
automobile for at least 5 years, rather than studying the exact length of ownership). 
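
The coding device can be illustrated in a few lines. In this sketch the category labels "voted" and "did not vote" are borrowed from the dichotomous example above and arranged to match the responses 1, 1, 0, 1, 1, 1, 0, 0, 1, 1; the variable names are our own.

```python
answers = ["voted", "voted", "did not vote", "voted", "voted",
           "voted", "did not vote", "did not vote", "voted", "voted"]

# Code category 1 ("voted") as 1 and category 2 as 0.
coded = [1 if a == "voted" else 0 for a in answers]

n = len(answers)
x = answers.count("voted")        # number in category 1
sample_proportion = x / n         # x / n
mean_of_coded = sum(coded) / n    # sample mean of the 1s and 0s

print(coded)
print(f"x/n = {sample_proportion}, mean of coded values = {mean_of_coded}")
```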

Analogous to the sample proportion x/n of individuals or objects falling in a 
particular category, let p represent the proportion of those in the entire population 
falling in the category. As with x/n, p is a quantity between 0 and 1, and while x/n 
is a sample characteristic, p is a characteristic of the population. The relationship 
between the two parallels the relationship between $\tilde{x}$ and $\tilde{\mu}$ and between $\bar{x}$ and $\mu$.
In particular, we will subsequently use x/n to make inferences about p. If a sample 
of 100 students from a large university reveals that 38 have Macintosh computers, 
then we could use 38/100 = .38 as a point estimate of the proportion of all students 
at the university who have Macs. Or we might ask whether this sample provides 
strong evidence for concluding that at least 1/3 of all students are Mac owners. 
With $k$ categories ($k > 2$), we can use the $k$ sample proportions to answer questions
about the population proportions $p_1, \ldots, p_k$.


EXERCISES Section 1.3 (33-43)

33. The May 1, 2009, issue of The Montclarian reported the
following home sale amounts for a sample of homes in
Alameda, CA that were sold the previous month (1000s
of $):

590 815 575 608 350 1285 408 540 555 679

a. Calculate and interpret the sample mean and
median.
b. Suppose the 6th observation had been 985 rather than
1285. How would the mean and median change?
c. Calculate a 20% trimmed mean by first trimming the
two smallest and two largest observations.
d. Calculate a 15% trimmed mean.

34. Exposure to microbial products, especially endotoxin,
may have an impact on vulnerability to allergic diseases.
The article “Dust Sampling Methods for Endotoxin—An
Essential, But Underestimated Issue” (Indoor Air,
2006: 20-27) considered various issues associated with
determining endotoxin concentration. The following data
on concentration (EU/mg) in settled dust for one sample
of urban homes and another of farm homes was kindly
supplied by the authors of the cited article.

U: 6.0 5.0 11.0 33.0 4.0 5.0 80.0 18.0 35.0 17.0 23.0
F: 4.0 14.0 11.0 9.0 9.0 8.0 4.0 20.0 5.0 8.9 21.0
   9.2 3.0 2.0 0.3


a. Determine the sample mean for each sample. How 
do they compare? 

b. Determine the sample median for each sample. How 
do they compare? Why is the median for the urban 
sample so different from the mean for that sample? 

c. Calculate the trimmed mean for each sample by 
deleting the smallest and largest observation. What 
are the corresponding trimming percentages? How 
do the values of these trimmed means compare to the 
corresponding means and medians? 


35. Mercury is a persistent and dispersive environmental contaminant
found in many ecosystems around the world.
When released as an industrial by-product, it often finds its
way into aquatic systems where it can have deleterious
effects on various avian and aquatic species. The accompanying
data on blood mercury concentration (μg/g) for adult
females near contaminated rivers in Virginia was read from
a graph in the article “Mercury Exposure Effects the
Reproductive Success of a Free-Living Terrestrial
Songbird, the Carolina Wren” (The Auk, 2011: 759-769;
this is a publication of the American Ornithologists’ Union).

.20 .22 .25 .30 .34 .41 .55 .56
1.42 1.70 1.83 2.20 2.25 3.07 3.25

a. Determine the values of the sample mean and sample
median and explain why they are different.
[Hint: $\sum x_i = 18.55$.]

b. Determine the value of the 10% trimmed mean and
compare to the mean and median.

c. By how much could the observation .20 be increased
without impacting the value of the sample median?


36. A sample of 26 offshore oil workers took part in a simulated
escape exercise, resulting in the accompanying data
on time (sec) to complete the escape (“Oxygen
Consumption and Ventilation During Escape from an
Offshore Platform,” Ergonomics, 1997: 281-292):

389 356 359 363 375 424 325 394 402
373 373 370 364 366 364 325 339 393
392 369 374 359 356 403 334 397


a. Construct a stem-and-leaf display of the data. How
does it suggest that the sample mean and median will
compare?

b. Calculate the values of the sample mean and median.
[Hint: $\sum x_i = 9638$.]

c. By how much could the largest time, currently 424,
be increased without affecting the value of the sample
median? By how much could this value be
decreased without affecting the value of the sample
median?

d. What are the values of $\bar{x}$ and $\tilde{x}$ when the observations
are reexpressed in minutes?


37. The article “Snow Cover and Temperature
Relationships in North America and Eurasia” (J.
Climate and Applied Meteorology, 1983: 460-469) used
statistical techniques to relate the amount of snow cover
on each continent to average continental temperature.
Data presented there included the following ten observations
on October snow cover for Eurasia during the years
1970-1979 (in million km²):

6.5 12.0 14.9 10.0 10.7 7.9 21.9 12.5 14.5 9.2

What would you report as a representative, or typical,
value of October snow cover for this period, and what
prompted your choice?


38. Blood pressure values are often reported to the nearest
5 mmHg (100, 105, 110, etc.). Suppose the actual blood
pressure values for nine randomly selected individuals are

118.6 127.4 138.4 130.0 113.7 122.0 108.3
131.5 133.2

a. What is the median of the reported blood pressure
values?

b. Suppose the blood pressure of the second individual
is 127.6 rather than 127.4 (a small change in a single
value). How does this affect the median of the
reported values? What does this say about the sensitivity
of the median to rounding or grouping in the
data?


39. The propagation of fatigue cracks in various aircraft parts
has been the subject of extensive study in recent years.
The accompanying data consists of propagation lives
(flight hours/$10^4$) to reach a given crack size in fastener
holes intended for use in military aircraft (“Statistical
Crack Propagation in Fastener Holes Under Spectrum
Loading,” J. Aircraft, 1983: 1028-1032):

 .736  .863  .865  .913  .915  .937  .983 1.007
1.011 1.064 1.109 1.132 1.140 1.153 1.253 1.394

a. Compute and compare the values of the sample mean
and median.

b. By how much could the largest sample observation
be decreased without affecting the value of the
median?


40. Compute the sample median, 25% trimmed mean, 10%
trimmed mean, and sample mean for the lifetime data
given in Exercise 27, and compare these measures.


41. A sample of $n = 10$ automobiles was selected, and each
was subjected to a 5-mph crash test. Denoting a car with
no visible damage by S (for success) and a car with such
damage by F, results were as follows:

S S F S S S F F S S

a. What is the value of the sample proportion of successes
$x/n$?
b. Replace each S with a 1 and each F with a 0. Then
calculate $\bar{x}$ for this numerically coded sample. How
does $\bar{x}$ compare to $x/n$?
c. Suppose it is decided to include 15 more cars in the
experiment. How many of these would have to be S’s
to give $x/n = .80$ for the entire sample of 25 cars?


42. a. If a constant $c$ is added to each $x_i$ in a sample,
yielding $y_i = x_i + c$, how do the sample mean
and median of the $y_i$'s relate to the mean and
median of the $x_i$'s? Verify your conjectures.

b. If each $x_i$ is multiplied by a constant $c$, yielding
$y_i = cx_i$, answer the question of part (a). Again, verify
your conjectures.


43. An experiment to study the lifetime (in hours) for a
certain type of component involved putting ten components
into operation and observing them for 100
hours. Eight of the components failed during that
period, and those lifetimes were recorded. Denote the
lifetimes of the two components still functioning after
100 hours by 100+. The resulting sample observations
were

48 79 100+ 35 92 86 57 100+ 17 29

Which of the measures of center discussed in this section
can be calculated, and what are the values of those
measures? [Note: The data from this experiment is said
to be “censored on the right.”]


1.4 Measures of Variability 


Reporting a measure of center gives only partial information about a data set or 
distribution. Several samples or populations may have identical measures of center 
yet differ from one another in other important ways. Figure 1.18 shows dotplots of 
three samples with the same mean and median, yet the extent of spread about the 
center is different for all three samples. The first sample has the largest amount of 
variability, the third has the smallest amount, and the second is intermediate to the 
other two in this respect. 


Figure 1.18 Samples with identical measures of center but different amounts of variability (dotplots of samples 1, 2, and 3 on a common axis from 30 to 70)


Measures of Variability for Sample Data 


The simplest measure of variability in a sample is the range, which is the difference 
between the largest and smallest sample values. The value of the range for sample 1 
in Figure 1.18 is much larger than it is for sample 3, reflecting more variability in the 
first sample than in the third. A defect of the range, though, is that it depends on only 
the two most extreme observations and disregards the positions of the remaining values.
Samples 1 and 2 in Figure 1.18 have identical ranges, yet when the observations
between the two extremes are taken into account, there is much less variability or 
dispersion in the second sample than in the first. 

Our primary measures of variability involve the deviations from the mean,
$x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$. That is, the deviations from the mean are obtained by
subtracting $\bar{x}$ from each of the $n$ sample observations. A deviation will be positive
if the observation is larger than the mean (to the right of the mean on the measurement
axis) and negative if the observation is smaller than the mean. If all the deviations
are small in magnitude, then all $x_i$'s are close to the mean and there is little
variability. Alternatively, if some of the deviations are large in magnitude, then
some $x_i$'s lie far from $\bar{x}$, suggesting a greater amount of variability. A simple way
to combine the deviations into a single quantity is to average them. Unfortunately,
this is a bad idea:

$$\text{sum of deviations} = \sum_{i=1}^{n} (x_i - \bar{x}) = 0$$

so that the average deviation is always zero. The verification uses several standard
rules of summation and the fact that $\sum \bar{x} = \bar{x} + \bar{x} + \cdots + \bar{x} = n\bar{x}$:

$$\sum (x_i - \bar{x}) = \sum x_i - \sum \bar{x} = \sum x_i - n\bar{x} = \sum x_i - n\left(\frac{1}{n}\sum x_i\right) = 0$$
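
The identity can also be confirmed with exact arithmetic. The sketch below is our own illustration, reusing the water-absorption data of Example 1.14 and Python's `fractions.Fraction` so that no floating-point rounding intrudes; it also shows that substituting a rounded mean leaves a small nonzero sum.

```python
from fractions import Fraction

# Example 1.14 data, entered as strings so each value is parsed exactly.
absorption = ["16.0", "30.5", "17.7", "17.5", "14.1", "10.0", "15.6",
              "15.0", "19.1", "17.9", "18.9", "18.5", "12.2", "6.0"]
xs = [Fraction(v) for v in absorption]
n = len(xs)

xbar = sum(xs) / n                         # exact mean, 229/14
sum_dev_exact = sum(x - xbar for x in xs)  # exactly zero

xbar_rounded = Fraction("16.36")           # mean as reported to two decimals
sum_dev_rounded = sum(x - xbar_rounded for x in xs)

print(f"exact mean = {xbar}, sum of deviations = {sum_dev_exact}")
print(f"with rounded mean: sum of deviations = {float(sum_dev_rounded)}")
```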


There are several ways to prevent negative and positive deviations from counteracting
one another when they are combined. One possibility is to work
with the absolute values of the deviations and calculate the average absolute
deviation $\sum |x_i - \bar{x}|/n$. Because the absolute value operation leads to a
number of theoretical difficulties, consider instead the squared deviations
$(x_1 - \bar{x})^2, (x_2 - \bar{x})^2, \ldots, (x_n - \bar{x})^2$. Rather than use the average squared deviation
$\sum (x_i - \bar{x})^2/n$, for several reasons we divide the sum of squared deviations by
$n - 1$ instead of $n$.


DEFINITION The sample variance, denoted by $s^2$, is given by

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{S_{xx}}{n - 1}$$

The sample standard deviation, denoted by $s$, is the (positive) square root of
the variance:

$$s = \sqrt{s^2}$$


Note that $s^2$ and $s$ are both nonnegative. One appealing property of the standard
deviation is that the unit for $s$ is the same as the unit for each of the $x_i$'s. If, for example,
the observations are fuel efficiencies in miles per gallon, then we might have
$s = 2.0$ mpg. The sample standard deviation can be interpreted as roughly the size of
a typical or representative deviation from the sample mean within the given sample.
Thus if $s = 2.0$ mpg, then some $x_i$'s in the sample are closer than 2.0 to $\bar{x}$, whereas
others are farther away; 2.0 is a representative (or “standard”) deviation from the
mean fuel efficiency. If s = 3.0 for a second sample of cars of another type, a typical 
deviation in this sample is roughly 1.5 times what it is in the first sample, an indica- 
tion of more variability in the second sample. 


EXAMPLE 1.17 The Web site www.fueleconomy.gov contains a wealth of information about fuel 
characteristics of various vehicles. In addition to EPA mileage ratings, there are 
many vehicles for which users have reported their own values of fuel efficiency 
(mpg). Consider the following sample of $n = 11$ efficiencies for the 2009 Ford
Focus equipped with an automatic transmission (for this model, EPA reports an
overall rating of 27 mpg, with 24 mpg for city driving and 33 mpg for highway driving):


Car    $x_i$     $x_i - \bar{x}$    $(x_i - \bar{x})^2$
 1     27.3      -5.96       35.522
 2     27.9      -5.36       28.730
 3     32.9      -0.36        0.130
 4     35.2       1.94        3.764
 5     44.9      11.64      135.490
 6     39.9       6.64       44.090
 7     30.0      -3.26       10.628
 8     29.7      -3.56       12.674
 9     28.5      -4.76       22.658
10     32.0      -1.26        1.588
11     37.6       4.34       18.836

$\sum x_i = 365.9$    $\sum (x_i - \bar{x}) = .04$    $\sum (x_i - \bar{x})^2 = 314.110$    $\bar{x} = 33.26$


Effects of rounding account for the sum of deviations not being exactly zero. The
numerator of $s^2$ is $S_{xx} = 314.110$, from which

$$s^2 = \frac{S_{xx}}{n - 1} = \frac{314.110}{11 - 1} = 31.41 \qquad s = 5.60$$

The size of a representative deviation from the sample mean 33.26 is roughly 5.6 mpg. 
Note: Of the nine people who also reported driving behavior, only three did more 
than 80% of their driving in highway mode; we bet you can guess which cars they 
drove. We haven’t a clue why all 11 reported values exceed the EPA figure—maybe 
only drivers with really good fuel efficiencies communicate their results.
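
The table's calculation translates directly into code. The following sketch, with variable names of our own choosing, applies the defining formula at full precision; it gives $S_{xx} \approx 314.105$ rather than 314.110 because the table uses $\bar{x}$ rounded to 33.26, but $s^2$ and $s$ agree with the example to the digits reported. The call to `statistics.variance`, which also divides by $n - 1$, serves as a cross-check.

```python
import math
import statistics

# Fuel efficiencies (mpg) for the 11 cars in Example 1.17.
mpg = [27.3, 27.9, 32.9, 35.2, 44.9, 39.9, 30.0, 29.7, 28.5, 32.0, 37.6]

n = len(mpg)
xbar = sum(mpg) / n
deviations = [x - xbar for x in mpg]
sxx = sum(d ** 2 for d in deviations)   # numerator of s^2
s2 = sxx / (n - 1)                      # sample variance, divisor n - 1
s = math.sqrt(s2)                       # sample standard deviation

print(f"xbar = {xbar:.2f}, Sxx = {sxx:.3f}")
print(f"s^2 = {s2:.2f}, s = {s:.2f}")
print(f"library check: {statistics.variance(mpg):.2f}")
```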


Motivation for $s^2$


To explain the rationale for the divisor $n - 1$ in $s^2$, note first that whereas $s^2$ measures
sample variability, there is a measure of variability in the population called
the population variance. We will use $\sigma^2$ (the square of the lowercase Greek letter
sigma) to denote the population variance and $\sigma$ to denote the population standard
deviation (the square root of $\sigma^2$). The value of $\sigma$ can be interpreted as roughly the
size of a typical deviation from $\mu$ within the entire population of $x$ values. When the
population is finite and consists of $N$ values,

$$\sigma^2 = \sum_{i=1}^{N} (x_i - \mu)^2 / N$$

which is the average of all squared deviations from the population mean (for the
population, the divisor is $N$ and not $N - 1$). More general definitions of $\sigma^2$ appear
in Chapters 3 and 4.

Just as $\bar{x}$ will be used to make inferences about the population mean $\mu$, we
should define the sample variance so that it can be used to make inferences about
$\sigma^2$. Now note that $\sigma^2$ involves squared deviations about the population mean $\mu$. If we
actually knew the value of $\mu$, then we could define the sample variance as the average
squared deviation of the sample $x_i$'s about $\mu$. However, the value of $\mu$ is almost never
known, so the sum of squared deviations about $\bar{x}$ must be used. But the $x_i$'s tend to be
closer to their average $\bar{x}$ than to the population average $\mu$. To compensate for this,
the divisor $n - 1$ is used rather than the sample size $n$. In other words, if we used a
divisor $n$ in the sample variance, then the resulting quantity would tend to underestimate
$\sigma^2$ (produce estimated values that are too small on the average), whereas dividing
by the slightly smaller $n - 1$ corrects this underestimating.
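
A small enumeration makes the underestimation concrete. The sketch below rests on assumptions that are ours, not the text's: a made-up population of five values, and samples of size 3 drawn with replacement so that every ordered sample is equally likely. Averaging over all 125 possible samples with exact fractions shows that the divisor $n$ falls short of $\sigma^2$ while the divisor $n - 1$ hits it exactly.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite population (values invented for illustration).
population = [Fraction(v) for v in (2, 4, 4, 7, 8)]
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N   # divisor N


def sum_sq_dev(sample):
    """Sum of squared deviations of a sample about its own mean."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample)


n = 3
# Every ordered sample of size n, drawn with replacement (assumed scheme).
samples = list(product(population, repeat=n))

avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(f"sigma^2                  = {sigma2}")
print(f"average with divisor n   = {avg_div_n}")
print(f"average with divisor n-1 = {avg_div_n_minus_1}")
```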

It is customary to refer to $s^2$ as being based on $n - 1$ degrees of freedom
(df). This terminology reflects the fact that although $s^2$ is based on the $n$ quantities
$x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$, these sum to 0, so specifying the values of any $n - 1$ of
the quantities determines the remaining value. For example, if $n = 4$ and $x_1 - \bar{x} = 8$,
$x_2 - \bar{x} = -6$, and $x_4 - \bar{x} = -4$, then automatically $x_3 - \bar{x} = 2$, so only three of the
four values of $x_i - \bar{x}$ are freely determined (3 df).


A Computing Formula for $s^2$


It is best to obtain $s^2$ from statistical software or else use a calculator that allows you
to enter data into memory and then view $s^2$ with a single keystroke. If your calculator
does not have this capability, there is an alternative formula that avoids calculating
the deviations. The formula involves both $(\sum x_i)^2$, summing and then squaring, and
$\sum x_i^2$, squaring and then summing.


An alternative expression for the numerator of $s^2$ is

$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$$


Proof Because $\bar{x} = \sum x_i/n$, $n\bar{x}^2 = (\sum x_i)^2/n$. Then,

$$\sum (x_i - \bar{x})^2 = \sum (x_i^2 - 2\bar{x} x_i + \bar{x}^2) = \sum x_i^2 - 2\bar{x} \sum x_i + \sum \bar{x}^2$$

$$= \sum x_i^2 - 2\bar{x} \cdot n\bar{x} + n\bar{x}^2 = \sum x_i^2 - n\bar{x}^2$$


EXAMPLE 1.18 Traumatic knee dislocation often requires surgery to repair ruptured ligaments. One 
measure of recovery is range of motion (measured as the angle formed when, starting
with the leg straight, the knee is bent as far as possible). The given data on postsurgical
range of motion appeared in the article “Reconstruction of the Anterior
and Posterior Cruciate Ligaments After Knee Dislocation” (Amer. J. Sports
Med., 1999: 189-197):


154 142 137 133 122 126 135 135 108 120 127 134 122


The sum of these 13 sample observations is $\sum x_i = 1695$, and the sum of their
squares is

$$\sum x_i^2 = (154)^2 + (142)^2 + \cdots + (122)^2 = 222{,}581$$

Thus the numerator of the sample variance is

$$S_{xx} = \sum x_i^2 - [(\sum x_i)^2]/n = 222{,}581 - (1695)^2/13 = 1579.0769$$

from which $s^2 = 1579.0769/12 = 131.59$ and $s = 11.47$.
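
Both routes to $S_{xx}$ can be compared side by side. This is a brief sketch with our own variable names, using the range-of-motion data above; it computes the numerator once from the shortcut and once from the deviations.

```python
import math

# Postsurgical range of motion (degrees) from Example 1.18.
rom = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 134, 122]

n = len(rom)
sum_x = sum(rom)                    # sum of x_i
sum_x2 = sum(x * x for x in rom)    # sum of x_i squared

sxx_shortcut = sum_x2 - sum_x ** 2 / n              # computing formula
xbar = sum_x / n
sxx_definition = sum((x - xbar) ** 2 for x in rom)  # defining formula

s2 = sxx_shortcut / (n - 1)
s = math.sqrt(s2)

print(f"sum x = {sum_x}, sum x^2 = {sum_x2}")
print(f"Sxx (shortcut)   = {sxx_shortcut:.4f}")
print(f"Sxx (definition) = {sxx_definition:.4f}")
print(f"s^2 = {s2:.2f}, s = {s:.2f}")
```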
Both the defining formula and the computational formula for $s^2$ can be sensitive to
rounding, so as much decimal accuracy as possible should be used in intermediate
calculations.
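
To see how much damage an early rounding can do, the sketch below (our own illustration on the Example 1.18 data) rounds $\bar{x}$ to one decimal place before using it. In this particular data set the $\sum x_i^2 - n\bar{x}^2$ form of the shortcut is thrown off by more than 50, while the defining formula moves only in the third decimal; other data sets may behave differently.

```python
rom = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 134, 122]
n = len(rom)
sum_x2 = sum(x * x for x in rom)

xbar_full = sum(rom) / n               # 130.3846...
xbar_rounded = round(xbar_full, 1)     # 130.4, rounded too early

# Shortcut in the form sum(x^2) - n * xbar^2
sxx_full = sum_x2 - n * xbar_full ** 2
sxx_shortcut_rounded = sum_x2 - n * xbar_rounded ** 2

# Defining formula with the same rounded mean
sxx_definition_rounded = sum((x - xbar_rounded) ** 2 for x in rom)

print(f"full precision:            {sxx_full:.4f}")
print(f"shortcut, rounded mean:    {sxx_shortcut_rounded:.4f}")
print(f"definition, rounded mean:  {sxx_definition_rounded:.4f}")
```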
Several other properties of s* enhance understanding and facilitate computation. 
PROPOSITION Let $x_1, x_2, \ldots, x_n$ be a sample and $c$ be any nonzero constant.

1. If $y_1 = x_1 + c, y_2 = x_2 + c, \ldots, y_n = x_n + c$, then $s_y^2 = s_x^2$, and
2. If $y_1 = cx_1, \ldots, y_n = cx_n$, then $s_y^2 = c^2 s_x^2$, $s_y = |c| s_x$,

where $s_x^2$ is the sample variance of the x’s and $s_y^2$ is the sample variance of the y’s.


In words, Result 1 says that the variance is unchanged when a constant $c$ is added
to (or subtracted from) each data value. This is intuitive, since adding or subtracting
$c$ shifts the location of the data set but leaves distances between data values
unchanged. According to Result 2, multiplication of each $x_i$ by $c$ results in $s^2$ being
multiplied by a factor of $c^2$. These properties can be proved by noting in Result 1 that
$\bar{y} = \bar{x} + c$ and in Result 2 that $\bar{y} = c\bar{x}$.
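
Both results are easy to confirm numerically. The sketch below is an illustration only: it reuses the data of Example 1.18, takes the arbitrary constant c = -2.5 (an assumption made purely for the demonstration), and relies on the `variance` and `stdev` functions of Python's standard `statistics` module, which use the divisor n - 1.

```python
import math
from statistics import stdev, variance

# Range-of-motion data from Example 1.18
x = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 134, 122]
c = -2.5  # arbitrary nonzero constant chosen for illustration

shifted = [xi + c for xi in x]   # y_i = x_i + c
scaled = [c * xi for xi in x]    # y_i = c * x_i

# Result 1: adding a constant leaves the variance unchanged
same_variance = math.isclose(variance(shifted), variance(x))

# Result 2: multiplying by c multiplies s^2 by c^2 and s by |c|
scaled_variance_ok = math.isclose(variance(scaled), c ** 2 * variance(x))
scaled_stdev_ok = math.isclose(stdev(scaled), abs(c) * stdev(x))

print(same_variance, scaled_variance_ok, scaled_stdev_ok)
```

The absolute value in Result 2 matters because a standard deviation is never negative, even when c is.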


Boxplots 


Stem-and-leaf displays and histograms convey rather general impressions about a data 
set, whereas a single summary such as the mean or standard deviation focuses on just 
one aspect of the data. In recent years, a pictorial summary called a boxplot has been 
used successfully to describe several of a data set’s most prominent features. These 
features include (1) center, (2) spread, (3) the extent and nature of any departure from 
symmetry, and (4) identification of “outliers,” observations that lie unusually far from 
the main body of the data. Because even a single outlier can drastically affect the 
values of $\bar{x}$ and $s$, a boxplot is based on measures that are “resistant” to the presence
of a few outliers—the median and a measure of variability called the fourth spread. 


DEFINITION Order the n observations from smallest to largest and separate the smallest half
from the largest half; the median $\tilde{x}$ is included in both halves if n is odd. Then
the lower fourth is the median of the smallest half and the upper fourth is
the median of the largest half. A measure of spread that is resistant to outliers
is the fourth spread $f_s$, given by

$f_s$ = upper fourth − lower fourth


Roughly speaking, the fourth spread is unaffected by the positions of those observations 
in the smallest 25% or the largest 25% of the data. Hence it is resistant to outliers. The 
fourths are very similar to quartiles, and the fourth spread is similar to the interquartile 
range, the difference between the upper and lower quartiles. But quartiles are a bit 
more cumbersome than fourths to calculate by hand, and there are several different 
sensible ways to compute the quartiles (so values may vary from one software package 
to another). 
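
The definition of the fourths translates directly into a few lines of code. The Python sketch below is one possible implementation, not a standard library routine; the function name `fourths` is our own. It splits the ordered sample into halves, sharing the median when n is odd, and takes the median of each half.

```python
from statistics import median

def fourths(data):
    """Return (lower fourth, upper fourth, fourth spread) of a sample."""
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2              # the median belongs to both halves when n is odd
    lower = median(xs[:half])        # median of the smallest half
    upper = median(xs[n - half:])    # median of the largest half
    return lower, upper, upper - lower

# Small samples with odd and even n
print(fourths([1, 2, 3, 4, 5]))       # halves are [1, 2, 3] and [3, 4, 5]
print(fourths([1, 2, 3, 4, 5, 6]))    # halves are [1, 2, 3] and [4, 5, 6]
```

Moving the largest observation of either sample far to the right changes none of the three returned values, which is the sense in which the fourth spread is resistant.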
The simplest boxplot is based on the following five-number summary: 


smallest $x_i$    lower fourth    median    upper fourth    largest $x_i$


First, draw a rectangle above a horizontal measurement scale; the left edge of the 
rectangle is above the lower fourth, and the right edge is above the upper fourth 
(so box width = $f_s$). Place a vertical line segment or some other symbol inside the
rectangle at the location of the median; the position of the median symbol relative to
the two edges conveys information about skewness in the middle 50% of the data.
Finally, draw “whiskers” out from either end of the rectangle to the smallest and 
largest observations. A boxplot with a vertical orientation can also be drawn by mak- 
ing obvious modifications in the construction process. 


EXAMPLE 1.19 The accompanying data consists of observations on the time until failure (1000s
of hours) for a sample of turbochargers from one type of engine (from “The Beta
Generalized Weibull Distribution: Properties and Applications,” Reliability
Engr. and System Safety, 2012: 5-15).

1.6 2.0 2.6 3.0 3.5 3.9 4.5 4.6 4.8 5.0
5.1 5.3 5.4 5.6 5.8 6.0 6.0 6.1 6.3 6.5
6.5 6.7 7.0 7.1 7.3 7.3 7.3 7.7 7.7 7.8
7.9 8.0 8.1 8.3 8.4 8.4 8.5 8.7 8.8 9.0


The five-number summary is as follows. 
smallest: 1.6 lower fourth: 5.05 median: 6.5 upper fourth: 7.85 largest: 9.0 


Figure 1.19 shows Minitab output from a request to describe the data. Q1 and Q3
are the lower and upper quartiles, respectively, and IQR (interquartile range) is the
difference between these quartiles. SE Mean is $s/\sqrt{n}$, the “standard error of the
mean”; it will be important in our subsequent development of several widely used
procedures for making inferences about the population mean $\mu$.


Variable  Count  Mean   SE Mean  StDev  Minimum  Q1     Median  Q3     Maximum  IQR
lifetime  40     6.253  0.309    1.956  1.600    5.025  6.500   7.875  9.000    2.850


Figure 1.19 Minitab description of the turbocharger lifetime data 


Figure 1.20 shows both a dotplot of the data and a boxplot. Both plots indicate that 
there is a reasonable amount of symmetry in the middle 50% of the data, but overall 
values stretch out more toward the low end than toward the high end—a negative 
skew. The box itself is not very narrow, indicating a fair amount of variability in the 
middle half of the data, and the lower whisker is especially long. 


Figure 1.20 (a) Dotplot and (b) Boxplot for the lifetime data (horizontal axis in both panels: Lifetime)
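
The difference between fourths and quartiles in this example can be reproduced with a short Python sketch. It is an illustration under stated assumptions: the helper `fourths` is our own, and the two `method` options of `statistics.quantiles` are used only as examples of different quartile conventions. For this particular sample the "exclusive" option happens to return the Q1 and Q3 values shown in the Minitab output of Figure 1.19; that agreement is an observation about these data, not a claim about how any package works in general.

```python
from statistics import median, quantiles

# Turbocharger lifetimes (1000s of hours) from Example 1.19
lifetimes = [
    1.6, 2.0, 2.6, 3.0, 3.5, 3.9, 4.5, 4.6, 4.8, 5.0,
    5.1, 5.3, 5.4, 5.6, 5.8, 6.0, 6.0, 6.1, 6.3, 6.5,
    6.5, 6.7, 7.0, 7.1, 7.3, 7.3, 7.3, 7.7, 7.7, 7.8,
    7.9, 8.0, 8.1, 8.3, 8.4, 8.4, 8.5, 8.7, 8.8, 9.0,
]

def fourths(data):
    """Lower and upper fourths: medians of the smallest and largest halves."""
    xs = sorted(data)
    half = (len(xs) + 1) // 2
    return median(xs[:half]), median(xs[len(xs) - half:])

lower4, upper4 = fourths(lifetimes)
five_number = (min(lifetimes), lower4, median(lifetimes), upper4, max(lifetimes))

# Two of the several sensible quartile conventions
q1_exc, _, q3_exc = quantiles(lifetimes, n=4, method="exclusive")
q1_inc, _, q3_inc = quantiles(lifetimes, n=4, method="inclusive")

print(five_number)
print(q1_exc, q3_exc)
print(q1_inc, q3_inc)
```

The three pairs of cut points differ only in the second or third decimal place here, which is why the fourth spread and the interquartile range usually tell the same story.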
Boxplots That Show Outliers


A boxplot can be embellished to indicate explicitly the presence of outliers. Many 
inferential procedures are based on the assumption that the population distribution 
is normal (a certain type of bell curve). Even a single extreme outlier in the sample 
warns the investigator that such procedures may be unreliable, and the presence of 
several mild outliers conveys the same message. 


DEFINITION Any observation farther than $1.5f_s$ from the closest fourth is an outlier. An outlier
is extreme if it is more than $3f_s$ from the nearest fourth, and it is mild otherwise.


Let’s modify our previous construction of a boxplot by drawing a whisker 
out from each end of the box to the smallest and largest observations that are not 
outliers. Now represent each mild outlier by a closed circle and each extreme outlier 
by an open circle. Some statistical computer packages do not distinguish between 
mild and extreme outliers. 


EXAMPLE 1.20 The Clean Water Act and subsequent amendments require that all waters in the 
United States meet specific pollution reduction goals to ensure that water is “fishable 
and swimmable.” The article “Spurious Correlation in the USEPA Rating Curve 
Method for Estimating Pollutant Loads” (J. of Environ. Engr., 2008: 610-618) 
investigated various techniques for estimating pollutant loads in watersheds; the 
authors “discuss the imperative need to use sound statistical methods” for this pur- 
pose. Among the data considered is the following sample of TN (total nitrogen) loads 
(kg N/day) from a particular Chesapeake Bay location, displayed here in increasing
order.

9.69 13.16 17.09 18.12 23.70 24.07 24.29 26.43
30.75 31.54 35.07 36.99 40.32 42.51 45.64 48.22
49.98 50.06 55.02 57.00 58.41 61.31 64.25 65.24
66.14 67.68 81.40 90.80 92.17 92.42 100.82 101.94
103.61 106.28 106.80 108.69 114.61 120.86 124.54 143.27
143.75 149.64 167.79 182.50 192.55 193.53 271.57 292.61
312.45 352.09 371.47 444.68 460.86 563.92 690.11 826.54
1529.35


Relevant summary quantities are

$\tilde{x} = 92.17$    lower 4th = 45.64    upper 4th = 167.79
$f_s = 122.15$    $1.5f_s = 183.225$    $3f_s = 366.45$

Subtracting $1.5f_s$ from the lower 4th gives a negative number, and none of the observations
are negative, so there are no outliers on the lower end of the data. However,

upper 4th + $1.5f_s$ = 351.015    upper 4th + $3f_s$ = 534.24

Thus the four largest observations (563.92, 690.11, 826.54, and 1529.35) are
extreme outliers, and 352.09, 371.47, 444.68, and 460.86 are mild outliers.

The whiskers in the boxplot in Figure 1.21 extend out to the smallest observa- 
tion, 9.69, on the low end and 312.45, the largest observation that is not an outlier, 
on the upper end. There is some positive skewness in the middle half of the data (the 
median line is somewhat closer to the left edge of the box than to the right edge) and 
a great deal of positive skewness overall. 
Figure 1.21 A boxplot of the nitrogen load data showing mild and extreme outliers
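
The outlier rule used in this example can be sketched in Python as follows. The function names `fourths` and `classify_outliers` are our own illustrative choices; the logic simply measures how far each observation lies beyond its nearest fourth and compares that distance with $1.5f_s$ and $3f_s$.

```python
from statistics import median

# TN loads (kg N/day) from Example 1.20
loads = [
    9.69, 13.16, 17.09, 18.12, 23.70, 24.07, 24.29, 26.43,
    30.75, 31.54, 35.07, 36.99, 40.32, 42.51, 45.64, 48.22,
    49.98, 50.06, 55.02, 57.00, 58.41, 61.31, 64.25, 65.24,
    66.14, 67.68, 81.40, 90.80, 92.17, 92.42, 100.82, 101.94,
    103.61, 106.28, 106.80, 108.69, 114.61, 120.86, 124.54, 143.27,
    143.75, 149.64, 167.79, 182.50, 192.55, 193.53, 271.57, 292.61,
    312.45, 352.09, 371.47, 444.68, 460.86, 563.92, 690.11, 826.54,
    1529.35,
]

def fourths(data):
    """Lower and upper fourths; the median is shared by both halves when n is odd."""
    xs = sorted(data)
    half = (len(xs) + 1) // 2
    return median(xs[:half]), median(xs[len(xs) - half:])

def classify_outliers(data):
    """Split a sample into non-outliers, mild outliers, and extreme outliers."""
    lower, upper = fourths(data)
    fs = upper - lower
    regular, mild, extreme = [], [], []
    for x in sorted(data):
        beyond = max(lower - x, x - upper)   # distance past the nearest fourth
        if beyond > 3 * fs:
            extreme.append(x)
        elif beyond > 1.5 * fs:
            mild.append(x)
        else:
            regular.append(x)
    return regular, mild, extreme

regular, mild, extreme = classify_outliers(loads)
whiskers = (regular[0], regular[-1])   # whisker ends: extreme non-outliers

print(mild)
print(extreme)
print(whiskers)
```

The whisker ends computed this way are the smallest and largest observations that are not outliers, matching the construction described before the example.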


Comparative Boxplots 


A comparative or side-by-side boxplot is a very effective way of revealing similari- 
ties and differences between two or more data sets consisting of observations on the 
same variable—fuel efficiency observations for four different types of automobiles, 
crop yields for three different varieties, and so on. 


EXAMPLE 1.21 High levels of sodium in food products represent a growing health concern. The 
accompanying data consists of values of sodium content in one serving of cereal 
for one sample of cereals manufactured by General Mills, another sample manufactured
by Kellogg, and a third sample produced by Post (see the website
http://www.nutritionresource.com/foodcomp2.cfm?id=0800 rather than visiting your
neighborhood grocery store!).


G: 211 408 171 178 359 249 205 203 201 223 234 256 218 
K: 143 202 120 229 150 5 207 362 252 275 224 
P: 253 220 212 41 140 215 266 3 214 280 


Figure 1.22 shows a comparative boxplot of the data from the software package R. The
typical sodium content (median) is roughly the same for all three companies. But the
distributions differ markedly in other respects. The General Mills data shows a substantial
positive skew both in the middle 50% and overall, with two outliers at the upper end.
The Kellogg data exhibits a negative skew in the middle 50% and a positive skew
overall, except for the outlier at the low end (this outlier is not identified by Minitab).
And the Post data is negatively skewed both in the middle 50% and overall with no
outliers. Variability as assessed by the box length (here the interquartile range rather
than the fourth spread) is smallest for the G brand and largest for the P brand, with the
K brand intermediate to the other two; looking instead at standard deviations, $s_K$ and
$s_P$ are roughly the same and both much larger than $s_G$.

Figure 1.22 Comparative boxplot of the data in Example 1.21, from R
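
The closing remark about standard deviations can be checked with a few lines of Python. This is a minimal sketch that assumes the sodium values exactly as listed in the example; the dictionary layout and names are our own.

```python
from statistics import median, stdev

# Sodium content per serving, by manufacturer, from Example 1.21
sodium = {
    "G": [211, 408, 171, 178, 359, 249, 205, 203, 201, 223, 234, 256, 218],
    "K": [143, 202, 120, 229, 150, 5, 207, 362, 252, 275, 224],
    "P": [253, 220, 212, 41, 140, 215, 266, 3, 214, 280],
}

# Sample standard deviation and median for each brand
s = {brand: stdev(values) for brand, values in sodium.items()}
med = {brand: median(values) for brand, values in sodium.items()}

for brand in sodium:
    print(brand, med[brand], round(s[brand], 1))
```

The medians differ by only a few units across the three brands, whereas the standard deviations for K and P are close to each other and well above the one for G.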


EXERCISES Section 1.4 (44—61) 


44. Poly(3-hydroxybutyrate) (PHB), a semicrystalline
polymer that is fully biodegradable and biocompatible,
is obtained from renewable resources. From a sustainability
perspective, PHB offers many attractive properties
though it is more expensive to produce than standard
plastics. The accompanying data on melting point
(°C) for each of 12 specimens of the polymer using a
differential scanning calorimeter appeared in the article
“The Melting Behaviour of Poly(3-Hydroxybutyrate)
by DSC. Reproducibility Study” (Polymer Testing,
2013: 215-220).

180.5 181.7 180.9 181.6 182.6 181.6
181.3 182.1 182.1 180.3 181.7 180.5

Compute the following:

a. The sample range
b. The sample variance $s^2$ from the definition [Hint:
First subtract 180 from each observation.]
c. The sample standard deviation
d. $s^2$ using the shortcut method


45. The value of Young’s modulus (GPa) was determined for
cast plates consisting of certain intermetallic substrates,
resulting in the following sample observations (“Strength
and Modulus of a Molybdenum-Coated Ti-25Al-10Nb-3U-1Mo
Intermetallic,” J. of Materials Engr.
and Performance, 1997: 46-50):

116.4 115.9 114.6 115.2 115.8

a. Calculate $\bar{x}$ and the deviations from the mean.
b. Use the deviations calculated in part (a) to obtain the
sample variance and the sample standard deviation.
c. Calculate $s^2$ by using the computational formula for
the numerator $S_{xx}$.
d. Subtract 100 from each observation to obtain a sample
of transformed values. Now calculate the sample
variance of these transformed values, and compare it
to $s^2$ for the original data.


46. The article “Effects of Short-Term Warming on Low and
High Latitude Forest Ant Communities” (Ecosphere,
May 2011, Article 62) described an experiment in which




observations on various characteristics were made using
minichambers of three different types: (1) cooler (PVC
frames covered with shade cloth), (2) control (PVC frames
only), and (3) warmer (PVC frames covered with plastic).
One of the article’s authors kindly supplied the accompanying
data on the difference between air and soil temperatures
(°C).


Cooler Control Warmer
ee 1.92 2.57
1.43 2.00 2.60
1.88 2.19 1.93
1.26 1.12 1.58
cies 2 1.78 2.30
1.86 1.84 0.84
1.90 2.45 2.65
1.57 2.03 0.12
Ls 7S 1.52 2.74
1.72 0.53 2453
2.41 1.90 2.13
2.34 2.86
0.83 2.31
1.34 Le OL
A. PS


a. Compare measures of center for the three different
samples.
b. Calculate, interpret, and compare the standard deviations
for the three different samples.
c. Do the fourth spreads for the three samples convey
the same message as do the standard deviations
about relative variability?
d. Construct a comparative boxplot (which was included
in the cited article) and comment on any interesting
features.


47. Zinfandel is a popular red wine varietal produced almost
exclusively in California. It is rather controversial
among wine connoisseurs because its alcohol content
varies quite substantially from one producer to another.
In May 2013, the author went to the website klwines.com,
randomly selected 10 zinfandels from among the


325 available, and obtained the following values of alcohol
content (%):

14.8 14.5 16.1 14.2 15.9
13.7 16.2 14.6 13.8 15.0

a. Calculate and interpret several measures of center.
b. Calculate the sample variance using the defining
formula.
c. Calculate the sample variance using the shortcut
formula after subtracting 13 from each observation.


48. Exercise 34 presented the following data on endotoxin
concentration in settled dust both for a sample of urban
homes and for a sample of farm homes:

U: 6.0 5.0 11.0 33.0 4.0 5.0 80.0 18.0 35.0 17.0 23.0
F: 4.0 14.0 11.0 9.0 9.0 8.0 4.0 20.0 5.0 8.9 21.0
   9.2 3.0 2.0 0.3


a. Determine the value of the sample standard deviation
for each sample, interpret these values, and then
contrast variability in the two samples. [Hint:
$\sum x_i = 237.0$ for the urban sample and $= 128.4$ for
the farm sample, and $\sum x_i^2 = 10,079$ for the urban
sample and 1617.94 for the farm sample.]

b. Compute the fourth spread for each sample and com- 
pare. Do the fourth spreads convey the same message 
about variability that the standard deviations do? 
Explain. 

c. The authors of the cited article also provided endo- 
toxin concentrations in dust bag dust: 


U: 34.0 49.0 13.0 33.0 24.0 24.0 35.0 104.0 34.0 40.0 38.0 1.0
F: 2.0 64.0 6.0 17.0 35.0 11.0 17.0 13.0 5.0 27.0 23.0
   28.0 10.0 13.0 0.2

Construct a comparative boxplot (as did the cited paper)
and compare and contrast the four samples.


49. A study of the relationship between age and various
visual functions (such as acuity and depth perception)
reported the following observations on the area of scleral
lamina (mm$^2$) from human optic nerve heads
(“Morphometry of Nerve Fiber Bundle Pores in the
Optic Nerve Head of the Human,” Experimental Eye
Research, 1988: 559-568):

2.75 2.62 2.74 3.85 2.34 2.74 3.93 4.21 3.88
4.33 3.46 4.52 2.43 3.65 2.78 3.56 3.01

a. Calculate $\sum x_i$ and $\sum x_i^2$.
b. Use the values calculated in part (a) to compute the
sample variance $s^2$ and then the sample standard
deviation $s$.


50. In 1997 a woman sued a computer keyboard manufacturer,
charging that her repetitive stress injuries were
caused by the keyboard (Genessy v. Digital Equipment
Corp.). The jury awarded about $3.5 million for pain
and suffering, but the court then set aside that award
as being unreasonable compensation. In making this


determination, the court identified a “normative” group of
27 similar cases and specified a reasonable award as one
within two standard deviations of the mean of the awards
in the 27 cases. The 27 awards were (in $1000s) 37, 60,
75, 115, 135, 140, 149, 150, 238, 290, 340, 410, 600, 750,
750, 750, 1050, 1100, 1139, 1150, 1200, 1200, 1250,
1576, 1700, 1825, and 2000, from which $\sum x_i = 20,179$,
$\sum x_i^2 = 24,657,511$. What is the maximum possible
amount that could be awarded under the two-standard-deviation
rule?


51. The article “A Thin-Film Oxygen Uptake Test for
the Evaluation of Automotive Crankcase
Lubricants” (Lubric. Engr., 1984: 75-83) reported
the following data on oxidation-induction time (min)
for various commercial oils:

87 103 130 160 180 195 132 145 211 105 145
153 152 138 87 99 93 119 129

a. Calculate the sample variance and standard
deviation.
b. If the observations were reexpressed in hours, what
would be the resulting values of the sample variance
and sample standard deviation? Answer without
actually performing the reexpression.


52. The first four deviations from the mean in a sample of
n = 5 reaction times were .3, .9, 1.0, and 1.3. What is the
fifth deviation from the mean? Give a sample for which
these are the five deviations from the mean.


53. A mutual fund is a professionally managed investment
scheme that pools money from many investors
and invests in a variety of securities. Growth funds 
focus primarily on increasing the value of invest- 
ments, whereas blended funds seek a balance between 
current income and growth. Here is data on the 
expense ratio (expenses as a % of assets, from
www.morningstar.com) for samples of 20 large-cap balanced
funds and 20 large-cap growth funds (“large-cap”
refers to the sizes of companies in which the
funds invest; the population sizes are 825 and 762, 
respectively): 


Bl 1.03 1.23 1.10 1.64 1.30 
1.27 1.25 0.78 1.05 0.64 
0.94 2.86 1.05 0.75 0.09 
0.79 1.61 1.26 0.93 0.84 
Gr 0.52 1.06 1.26 Zl 155 
0.99 1.10 1.07 1.81 2.05 
0.91 0.79 1.39 0.62 1.52 
1.02 1.10 1.78 1.01 1.15 


a. Calculate and compare the values of $\bar{x}$, $\tilde{x}$, and $s$ for
the two types of funds. 

b. Construct a comparative boxplot for the two types of 
funds, and comment on interesting features. 
54. Grip is applied to produce normal surface forces that
compress the object being gripped. Examples include
two people shaking hands, or a nurse squeezing a
patient’s forearm to stop bleeding. The article
“Investigation of Grip Force, Normal Force, Contact
Area, Hand Size, and Handle Size for Cylindrical
Handles” (Human Factors, 2008: 734-744) included
the following data on grip strength (N) for a sample of 42
individuals:

16 18 18 26 33 41 54 56 66 68 87 91 95
98 106 109 111 118 127 127 135 145 147 149 151 168
172 189 190 200 210 220 229 230 233 238 244 259
294 403

a. Construct a stem-and-leaf display based on repeating
each stem value twice, and comment on interesting
features.
b. Determine the values of the fourths and the
fourth spread.
c. Construct a boxplot based on the five-number summary,
and comment on its features.
d. How large or small does an observation have to be to
qualify as an outlier? An extreme outlier? Are there
any outliers?
e. By how much could the observation 403, currently
the largest, be decreased without affecting $f_s$?


55. Here is a stem-and-leaf display of the escape time data
introduced in Exercise 36 of this chapter.

32 a2
33 49
34
35 6699
36 34469
37 03345
38 9
39 2347
40 23
41
42 4

a. Determine the value of the fourth spread.
b. Are there any outliers in the sample? Any extreme
outliers?
c. Construct a boxplot and comment on its features.
d. By how much could the largest observation, currently
424, be decreased without affecting the value
of the fourth spread?


56. The following data on distilled alcohol content (%) for
a sample of 35 port wines was extracted from the article
“A Method for the Estimation of Alcohol in
Fortified Wines Using Hydrometer Baumé and
Refractometer Brix” (Amer. J. Enol. Vitic., 2006:
486-490). Each value is an average of two duplicate
measurements.

16.35 18.85 16.20 17.75 19.58 17.73 22.75 23.78 23.25
19.08 19.62 19.20 20.05 17.85 19.17 19.48 20.00 19.97
17.48 17.15 19.07 19.90 18.68 18.82 19.03 19.45 19.37
19.20 18.00 19.60 19.33 21.22 19.50 15.30 22.25

Use methods from this chapter, including a boxplot that
shows outliers, to describe and summarize the data.


57. A sample of 20 glass bottles of a particular type was
selected, and the internal pressure strength of each bottle
was determined. Consider the following partial sample
information:

median = 202.2    lower fourth = 196.0    upper fourth = 216.8

Three smallest observations    125.8    188.1    193.7
Three largest observations     221.3    230.5    250.2

a. Are there any outliers in the sample? Any extreme
outliers?
b. Construct a boxplot that shows outliers, and comment
on any interesting features.


58. A company utilizes two different machines to manufacture
parts of a certain type. During a single shift, a sample
of n = 20 parts produced by each machine is
obtained, and the value of a particular critical dimension
for each part is determined. The accompanying comparative
boxplot is constructed from the resulting
data. Compare and contrast the two samples.

Comparative boxplot for Exercise 58 (axes: Machine and Dimension)


59. Blood cocaine concentration (mg/L) was determined
both for a sample of individuals who had died from
cocaine-induced excited delirium (ED) and for a sample
of those who had died from a cocaine overdose without
excited delirium; survival time for people in both
groups was at most 6 hours. The accompanying data
was read from a comparative boxplot in the article
“Fatal Excited Delirium Following Cocaine Use” (J.
of Forensic Sciences, 1997: 25-31).


ED 0 0 0 0 1 1 12 1 2 2 33 
3 4 #5 7 8 1015 2.7 28 
3.5 40 8.9 9.2 11.7 21.0 
Non-ED 0 0 O O O J LL 2 2.2 
3.3 3 4 5 5S 6 8.9 1.0 


12 14 15 17 2.0 3.2 3.5 4.1 
43 48 5.0 56 5.9 60 64 7.9 
8.3 8.7 9.1 9.6 9.9 11.0 11.5 
12.2 12.7 14.0 16.6 17.8 


a. Determine the medians, fourths, and fourth spreads 
for the two samples. 

b. Are there any outliers in either sample? Any extreme 
outliers? 

c. Construct a comparative boxplot, and use it as a 
basis for comparing and contrasting the ED and 
non-ED samples. 


60. Observations on burst strength (lb/in$^2$) were obtained
both for test nozzle closure welds and for production
cannister nozzle welds (“Proper Procedures Are the
Key to Welding Radioactive Waste Cannisters,”
Welding J., Aug. 1997: 61-67).

Test       7200 6100 7300 7300 8000 7400
           7300 7300 8000 6700 8300
Cannister  5250 5625 5900 5900 5700 6050
           5800 6000 5875 6100 5850 6600

Construct a comparative boxplot and comment on interesting
features (the cited article did not include such a
picture, but the authors commented that they had looked
at one).


61. The accompanying comparative boxplot of gasoline
vapor coefficients for vehicles in Detroit appeared in the
article “Receptor Modeling Approach to VOC
Emission Inventory Validation” (J. of Envir. Engr.,
1995: 483-490). Discuss any interesting features.

Comparative boxplot for Exercise 61 (axes: Gas vapor coefficient and Time, with times 6 a.m., 8 a.m., 12 noon, 2 p.m., 10 p.m.)

SUPPLEMENTARY EXERCISES (62-83) 


62. Consider the following information on ultimate tensile
strength (lb/in$^2$) for a sample of n = 4 hard zirconium copper
wire specimens (from “Characterization Methods for
Fine Copper Wire,” Wire J. Intl., Aug., 1997: 74-80):

$\bar{x} = 76,831$    $s = 180$    smallest $x_i = 76,683$    largest $x_i = 77,048$

Determine the values of the two middle sample observations
(and don’t do it by successive guessing!).


63. A sample of 77 individuals working at a particular office
was selected and the noise level (dBA) experienced by
each individual was determined, yielding the following
data (“Acceptable Noise Levels for Construction Site
Offices,” Building Serv. Engr. Research and Technology,
2009: 87-94).

55.3 55.3 55.3 55.9 55.9 55.9 55.9 56.1 56.1 56.1 56.1
56.1 56.1 56.8 56.8 57.0 57.0 57.0 57.8 57.8 57.8 57.9
57.9 57.9 58.8 58.8 58.8 59.8 59.8 59.8 62.2 62.2 63.8
63.8 63.8 63.9 63.9 63.9 64.7 64.7 64.7 65.1 65.1 65.1
65.3 65.3 65.3 65.3 67.4 67.4 67.4 67.4 68.7 68.7 68.7
68.7 69.0 70.4 70.4 71.2 71.2 71.2 73.0 73.0 73.1 73.1
74.6 74.6 74.6 74.6 79.3 79.3 79.3 79.3 83.0 83.0 83.0

Use various techniques discussed in this chapter to
organize, summarize, and describe the data.


64. Fretting is a wear process that results from tangential
oscillatory movements of small amplitude in machine
parts. The article “Grease Effect on Fretting Wear of
Mild Steel” (Industrial Lubrication and Tribology,
2008: 67-78) included the following data on volume
wear ($10^{-4}$ mm$^3$) for base oils having four different
viscosities.

Viscosity  Wear
20.4       58.8  30.8  27.3  29.9  17.7   76.5
30.2       44.5  47.1  48.7  41.6  32.8   18.3
89.4       73.3  57.1  66.0  93.8  133.2  81.1
252.6      30.6  24.2  16.6  38.9  28.7   23.6


a. The sample coefficient of variation $100s/\bar{x}$ assesses
the extent of variability relative to the mean (specifi- 
cally, the standard deviation as a percentage of the 
mean). Calculate the coefficient of variation for the 
sample at each viscosity. Then compare the results 
and comment. 

b. Construct a comparative boxplot of the data and 
comment on interesting features. 


65. The accompanying frequency distribution of fracture 
strength (MPa) observations for ceramic bars fired in a 
particular kiln appeared in the article “Evaluating Tunnel
Kiln Performance” (Amer. Ceramic Soc. Bull., Aug.
1997: 59-63).

Class      81-<83  83-<85  85-<87  87-<89  89-<91
Frequency  6       7       17      30      43
Class      91-<93  93-<95  95-<97  97-<99
Frequency  28      22      13      3

a. Construct a histogram based on relative frequencies, 
and comment on any interesting features. 

b. What proportion of the strength observations are at 
least 85? Less than 95? 

c. Roughly what proportion of the observations are less 
than 90? 


66. A deficiency of the trace element selenium in the diet 
can negatively impact growth, immunity, muscle and 
neuromuscular function, and fertility. The introduc- 
tion of selenium supplements to dairy cows is justified 
when pastures have low selenium levels. Authors of 
the article “Effects of Short-Term Supplementation 
with Selenised Yeast on Milk Production and 
Composition of Lactating Cows” (Australian J. of 
Dairy Tech., 2004: 199-203) supplied the following 
data on milk selenium concentration (mg/L) for a 
sample of cows given a selenium supplement and a 
control sample given no supplement, both initially and 
after a 9-day period. 


Obs  Init Se  Init Cont  Final Se  Final Cont
1    11.4     9.1        138.3     9.3
2    9.6      8.7        104.0     8.8
3    10.1     9.7        96.4      8.8
4    8.5      10.8       89.0      10.1
5    10.3     10.9       88.0      9.6
6    10.6     10.6       103.8     8.6
7    11.8     10.1       147.3     10.4
8    9.8      12.3       97.1      12.4
9    10.9     8.8        172.6     9.3
10   10.3     10.4       146.3     9.5
11   10.2     10.9       99.0      8.4
12   11.4     10.4       122.3     8.7
13   9.2      11.6       103.0     12.5
14   10.6     10.9       117.8     9.1
15   10.8                121.5
16   8.2                 93.0


a. Do the initial Se concentrations for the supplement 
and control samples appear to be similar? Use vari- 
ous techniques from this chapter to summarize the 
data and answer the question posed. 

b. Again use methods from this chapter to summarize 
the data and then describe how the final Se concen- 
tration values in the treatment group differ from 
those in the control group. 


67. Aortic stenosis refers to a narrowing of the aortic valve 
in the heart. The article “Correlation Analysis of 
Stenotic Aortic Valve Flow Patterns Using Phase 
Contrast MRI” (Annals of Biomed. Engr., 2005: 
878-887) gave the following data on aortic root diameter 
(cm) and gender for a sample of patients having various 
degrees of aortic stenosis: 


M: 3.7 3.4 3.7 4.0 3.9 3.8 3.4 3.6 3.1 4.0 3.4 3.8 3.5
F: 3.8 2.6 3.2 3.0 4.3 3.5 3.1 3.1 3.2 3.0


a. Compare and contrast the diameter observations for 
the two genders. 

b. Calculate a 10% trimmed mean for each of the two 
samples, and compare to other measures of center 
(for the male sample, the interpolation method men- 
tioned in Section 1.3 must be used). 


68. a. For what value of $c$ is the quantity $\sum (x_i - c)^2$ minimized?
[Hint: Take the derivative with respect to $c$,
set equal to 0, and solve.]

b. Using the result of part (a), which of the two quantities
$\sum (x_i - \bar{x})^2$ and $\sum (x_i - \mu)^2$ will be smaller than
the other (assuming that $\bar{x} \neq \mu$)?


69. a. Let $a$ and $b$ be constants and let $y_i = ax_i + b$ for
$i = 1, 2, \ldots, n$. What are the relationships between $\bar{x}$
and $\bar{y}$ and between $s_x^2$ and $s_y^2$?

b. A sample of temperatures for initiating a certain
chemical reaction yielded a sample average (°C) of
87.3 and a sample standard deviation of 1.04. What
are the sample average and standard deviation measured
in °F? [Hint: $F = \frac{9}{5}C + 32$.]


70. Elevated energy consumption during exercise continues
after the workout ends. Because calories burned after
exercise contribute to weight loss and have other consequences,
it is important to understand this process. The
article “Effect of Weight Training Exercise and
Treadmill Exercise on Post-Exercise Oxygen
Consumption” (Medicine and Science in Sports and
Exercise, 1998: 518-522) reported the accompanying
data from a study in which oxygen consumption (liters)
was measured continuously for 30 minutes for each of 15
subjects both after a weight training exercise and after a
treadmill exercise.

Subject        1     2     3     4     5     6     7
Weight (x)     14.6  14.4  19.5  24.3  16.3  22.1  23.0
Treadmill (y)  11.3  5.3   9.1   15.2  10.1  19.6  20.8

Subject        8     9     10    11    12    13    14    15
Weight (x)     18.7  19.0  17.0  19.1  19.6  23.2  18.5  15.9
Treadmill (y)  10.3  10.3  2.6   16.6  22.4  23.6  12.6  4.4

a. Construct a comparative boxplot of the weight and
treadmill observations, and comment on what you
see.
b. The data is in the form of (x, y) pairs, with x and y
measurements on the same variable under two different
conditions, so it is natural to focus on the differences
within pairs: $d_1 = x_1 - y_1, \ldots, d_n = x_n - y_n$.
Construct a boxplot of the sample differences. What
does it suggest?


71. Here is a description from Minitab of the strength data
given in Exercise 13.

Variable  N    Mean    Median  TrMean  StDev  SE Mean
strength  153  135.39  135.40  135.41  4.59   0.37

Variable  Minimum  Maximum  Q1      Q3
strength  122.20   147.70   132.95  138.25

a. Comment on any interesting features (the quartiles
and fourths are virtually identical here).
b. Construct a boxplot of the data based on the quartiles,
and comment on what you see.


72. Anxiety disorders and symptoms can often be effectively
treated with benzodiazepine medications. It is
known that animals exposed to stress exhibit a 
decrease in benzodiazepine receptor binding in the 
frontal cortex. The article “Decreased Benzodiazepine 
Receptor Binding in Prefrontal Cortex in Combat- 
Related Posttraumatic Stress Disorder” (Amer. J. 
of Psychiatry, 2000: 1120-1126) described the first 
study of benzodiazepine receptor binding in individu- 
als suffering from PTSD. The accompanying data on a 
receptor binding measure (adjusted distribution vol- 
ume) was read from a graph in the article. 


PTSD: 10, 20, 25, 28, 31, 35, 37, 38, 38, 39, 39, 
42, 46 
Healthy: 23, 39, 40, 41, 43, 47, 51, 58, 63, 66, 67, 
69, 72 


Use various methods from this chapter to describe and 
summarize the data. 


73. The article “Can We Really Walk Straight?” (Amer. J.
of Physical Anthropology, 1992: 19-27) reported on an
experiment in which each of 20 healthy men was asked
to walk as straight as possible to a target 60 m away at
normal speed. Consider the following observations on
cadence (number of strides per second):


.95   .85   .92   .95   .93   .86   1.00  .92   .85   .81
.78   .93   .93   1.05  .93   1.06  1.06  .96   .81   .96


Use the methods developed in this chapter to summarize
the data; include an interpretation or discussion wherever
appropriate. [Note: The author of the article used a rather
sophisticated statistical analysis to conclude that people
cannot walk in a straight line and suggested several
explanations for this.]


74. The mode of a numerical data set is the value that
occurs most frequently in the set.

a. Determine the mode for the cadence data given in
Exercise 73.

b. For a categorical sample, how would you define the
modal category?


75. Specimens of three different types of rope wire were
selected, and the fatigue limit (MPa) was determined for
each specimen, resulting in the accompanying data.


Type 1   350  350  350  358  370  370  370  371
         371  372  372  384  391  391  392

Type 2   350  354  359  363  365  368  369  371
         373  374  376  380  383  388  392

Type 3   350  361  362  364  364  365  366  371
         377  377  377  379  380  380  392


a. Construct a comparative boxplot, and comment on
similarities and differences.

b. Construct a comparative dotplot (a dotplot for each
sample with a common scale). Comment on similarities
and differences.

c. Does the comparative boxplot of part (a) give an
informative assessment of similarities and differences?
Explain your reasoning.


76. The three measures of center introduced in this chapter are
the mean, median, and trimmed mean. Two additional
measures of center that are occasionally used are the midrange,
which is the average of the smallest and largest
observations, and the midfourth, which is the average of
the two fourths. Which of these five measures of center are
resistant to the effects of outliers and which are not?
Explain your reasoning.


77. The authors of the article “Predictive Model for Pitting
Corrosion in Buried Oil and Gas Pipelines”
(Corrosion, 2009: 332-342) provided the data on which
their investigation was based.

a. Consider the following sample of 61 observations on 
maximum pitting depth (mm) of pipeline specimens 
buried in clay loam soil. 


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 




0.41   0.41   0.41   0.41   0.43   0.43   0.43   0.48   0.48
0.58   0.79   0.79   0.81   0.81   0.81   0.91   0.94   0.94
1.02   1.04   1.04   1.17   1.17   1.17   1.17   1.17   1.17
1.17   1.19   1.19   1.27   1.40   1.40   1.59   1.59   1.60
1.68   1.91   1.96   1.96   1.96   2.10   2.21   2.31   2.46
2.49   2.57   2.74   3.10   3.18   3.30   3.58   3.58   4.15
4.75   5.33   7.65   7.70   8.13   10.41  13.44


Construct a stem-and-leaf display in which the two 
largest values are shown in a last row labeled HI. 

b. Refer back to (a), and create a histogram based on
eight classes with 0 as the lower limit of the first
class and class widths of .5, .5, .5, .5, 1, 2, 5, and 5,
respectively.

c. The accompanying comparative boxplot from
Minitab shows plots of pitting depth for four different
types of soils. Describe its important features.

[Figure: Comparative boxplot for Exercise 77]


78. Consider a sample $x_1, x_2, \ldots, x_n$ and suppose that the
values of $\bar{x}$, $s^2$, and $s$ have been calculated.

a. Let $y_i = x_i - \bar{x}$ for $i = 1, \ldots, n$. How do the values of
$s^2$ and $s$ for the $y_i$'s compare to the corresponding
values for the $x_i$'s? Explain.

b. Let $z_i = (x_i - \bar{x})/s$ for $i = 1, \ldots, n$. What are the
values of the sample variance and sample standard
deviation for the $z_i$'s?


79. Let $\bar{x}_n$ and $s_n^2$ denote the sample mean and variance for
the sample $x_1, \ldots, x_n$, and let $\bar{x}_{n+1}$ and $s_{n+1}^2$ denote these
quantities when an additional observation $x_{n+1}$ is added
to the sample.

a. Show how $\bar{x}_{n+1}$ can be computed from $\bar{x}_n$ and $x_{n+1}$.

b. Show that

$$n s_{n+1}^2 = (n - 1) s_n^2 + \frac{n}{n+1} (x_{n+1} - \bar{x}_n)^2$$

so that $s_{n+1}^2$ can be computed from $x_{n+1}$, $\bar{x}_n$, and $s_n^2$.

c. Suppose that a sample of 15 strands of drapery yarn
has resulted in a sample mean thread elongation of
12.58 mm and a sample standard deviation of .512
mm. A 16th strand results in an elongation value of
11.8. What are the values of the sample mean and
sample standard deviation for all 16 elongation
observations?


80. Lengths of bus routes for any particular transit system 
will typically vary from one route to another. The article 
“Planning of City Bus Routes” (J. of the Institution of 
Engineers, 1995: 211-215) gives the following informa- 
tion on lengths (km) for one particular system: 


Length     6–<8    8–<10   10–<12  12–<14  14–<16
Frequency  6       23      30      35      32

Length     16–<18  18–<20  20–<22  22–<24  24–<26
Frequency  48      42      40      28      27

Length     26–<28  28–<30  30–<35  35–<40  40–<45
Frequency  26      14      27      11      2


a. Draw a histogram corresponding to these frequen- 
cies. 

b. What proportion of these route lengths are less than 
20? What proportion of these routes have lengths of at 
least 30? 

c. Roughly what is the value of the 90th percentile of
the route length distribution?

d. Roughly what is the median route length? 


81. A study carried out to investigate the distribution of total 
braking time (reaction time plus accelerator-to-brake move- 
ment time, in ms) during real driving conditions at 60 km/ 
hr gave the following summary information on the distribu- 
tion of times (“A Field Study on Braking Responses 
During Driving,” Ergonomics, 1995: 1903-1910): 


mean = 535 median = 500 mode = 500 
sd = 96 minimum = 220 maximum = 925 
5th percentile = 400 10th percentile = 430 
90th percentile = 640 95th percentile = 720 


What can you conclude about the shape of a histogram 
of this data? Explain your reasoning. 


82. The sample data $x_1, x_2, \ldots, x_n$ sometimes represents a
time series, where $x_t$ = the observed value of a response
variable x at time t. Often the observed series shows a great
deal of random variation, which makes it difficult to study
longer-term behavior. In such situations, it is desirable to
produce a smoothed version of the series. One technique
for doing so involves exponential smoothing. The
value of a smoothing constant $\alpha$ is chosen ($0 < \alpha < 1$).
Then with $\bar{x}_t$ = smoothed value at time t, we set $\bar{x}_1 = x_1$,
and for $t = 2, 3, \ldots, n$, $\bar{x}_t = \alpha x_t + (1 - \alpha)\bar{x}_{t-1}$.

a. Consider the following time series in which
$x_t$ = temperature (°F) of effluent at a sewage treatment
plant on day t: 47, 54, 53, 50, 46, 46, 47, 50,
51, 50, 46, 52, 50, 50. Plot each $x_t$ against t on a
two-dimensional coordinate system (a time-series
plot). Does there appear to be any pattern?

b. Calculate the $\bar{x}_t$'s using $\alpha = .1$. Repeat using $\alpha = .5$.
Which value of $\alpha$ gives a smoother $\bar{x}_t$ series?

c. Substitute $\bar{x}_{t-1} = \alpha x_{t-1} + (1 - \alpha)\bar{x}_{t-2}$ on the right-hand
side of the expression for $\bar{x}_t$, then substitute $\bar{x}_{t-2}$
in terms of $x_{t-2}$ and $\bar{x}_{t-3}$, and so on. On how many of
the values $x_t, x_{t-1}, \ldots, x_1$ does $\bar{x}_t$ depend? What happens
to the coefficient on $x_{t-k}$ as k increases?

d. Refer to part (c). If t is large, how sensitive is $\bar{x}_t$ to
the initialization $\bar{x}_1 = x_1$? Explain.


[Note: A relevant reference is the article “Simple
Statistics for Interpreting Environmental Data,”
Water Pollution Control Fed. J., 1981: 167-175.]
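
The smoothing recursion described in Exercise 82 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `exponential_smooth` and the short series `readings` are hypothetical and are not the data from the exercise.

```python
def exponential_smooth(series, alpha):
    """Return the exponentially smoothed version of `series`.

    The first smoothed value equals the first observation; each later
    smoothed value is alpha * (current observation)
    + (1 - alpha) * (previous smoothed value).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if not series:
        return []
    smoothed = [float(series[0])]
    for x_t in series[1:]:
        smoothed.append(alpha * x_t + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical readings, chosen only so the arithmetic is easy to follow.
readings = [10, 20, 10, 20]
print(exponential_smooth(readings, 0.5))  # [10.0, 15.0, 12.5, 16.25]
```

Each smoothed value depends only on the current observation and the previous smoothed value, so the whole series is produced in a single pass.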


83. Consider numerical observations $x_1, \ldots, x_n$. It is frequently
of interest to know whether the $x_i$'s are (at least approximately)
symmetrically distributed about some value. If n is
at least moderately large, the extent of symmetry can be
assessed from a stem-and-leaf display or histogram.
However, if n is not very large, such pictures are not particularly
informative. Consider the following alternative.
Let $y_1$ denote the smallest $x_i$, $y_2$ the second smallest $x_i$, and
so on. Then plot the following pairs as points on a
two-dimensional coordinate system: $(y_n - \tilde{x}, \tilde{x} - y_1)$,
$(y_{n-1} - \tilde{x}, \tilde{x} - y_2)$, $(y_{n-2} - \tilde{x}, \tilde{x} - y_3), \ldots$, where $\tilde{x}$ is the
sample median. There are n/2
points when n is even and (n - 1)/2 when n is odd.

a. What does this plot look like when there is perfect 
symmetry in the data? What does it look like when 
observations stretch out more above the median than 
below it (a long upper tail)? 


BIBLIOGRAPHY 


Albert, Jim and Maria Rizzo, R by Example, Springer, New 
York, 2012. An up-to-date introduction whose focus is on 
applying statistical techniques rather than on details of the R 
programming language. 

Chambers, John, William Cleveland, Beat Kleiner, and Paul 
Tukey, Graphical Methods for Data Analysis, Brooks/ 
Cole, Pacific Grove, CA, 1983. A highly recommended 
presentation of various graphical and pictorial methodology 
in statistics. 

Cleveland, William, Visualizing Data, Hobart Press, Summit, 
NJ, 1993. An entertaining tour of pictorial techniques. 

Freedman, David, Robert Pisani, and Roger Purves, Statistics 
(4th ed.), Norton, New York, 2007. An excellent, very 
nonmathematical survey of basic statistical reasoning and 
methodology. 

Hoaglin, David, Frederick Mosteller, and John Tukey, 
Understanding Robust and Exploratory Data Analysis, 
Wiley, New York, 1983. Discusses why, as well as how, 


703.4 978.0 1656.0 1697.8 2745.6


84. Consider a sample $x_1, \ldots, x_n$ with n even. Let $\bar{x}_L$ and $\bar{x}_U$
denote the average of the smallest n/2 and the largest n/2
observations, respectively. Show that the mean absolute
deviation from the median for this sample satisfies


$$\sum |x_i - \tilde{x}| / n = (\bar{x}_U - \bar{x}_L)/2$$


Then show that if n is odd and the two averages are 
calculated after excluding the median from each half, 
replacing n on the left with n — 1 gives the correct result. 
[Hint: Break the sum into two parts, the first involving 
observations less than or equal to the median and the 
second involving observations greater than or equal to 
the median.] 
Probability


INTRODUCTION 


The term probability refers to the study of randomness and uncertainty. In any sit- 
uation in which one of a number of possible outcomes may occur, the discipline of 
probability provides methods for quantifying the chances, or likelihoods, associated 
with the various outcomes. The language of probability is constantly used in an 
informal manner in both written and spoken contexts. Examples include such state- 
ments as “It is likely that the Dow Jones average will increase by the end of the 
year,” “There is a 50-50 chance that the incumbent will seek reelection,” “There 
will probably be at least one section of that course offered next year,” “The odds 
favor a quick settlement of the strike,” and “It is expected that at least 20,000 con- 
cert tickets will be sold.” In this chapter, we introduce some elementary probability 
concepts, indicate how probabilities can be interpreted, and show how the rules of 
probability can be applied to compute the probabilities of many interesting events. 
The methodology of probability will then permit us to express in precise language 
such informal statements as those given above. 

The study of probability as a branch of mathematics goes back over 300 years, 
where it had its genesis in connection with questions involving games of chance. 
Many books are devoted exclusively to probability, but our objective here is to cover 
only that part of the subject that has the most direct bearing on problems of statisti- 
cal inference. 


2.1 Sample Spaces and Events


An experiment is any activity or process whose outcome is subject to uncertainty. 
Although the word experiment generally suggests a planned or carefully controlled 
laboratory testing situation, we use it here in a much wider sense. Thus experiments 
that may be of interest include tossing a coin once or several times, selecting a card 
or cards from a deck, weighing a loaf of bread, ascertaining the commuting time 
from home to work on a particular morning, obtaining blood types from a group of 
individuals, or measuring the compressive strengths of different steel beams. 


The Sample Space of an Experiment 


DEFINITION The sample space of an experiment, denoted by $\mathcal{S}$, is the set of all possible
outcomes of that experiment.


EXAMPLE 2.1 The simplest experiment to which probability applies is one with two possible
outcomes. One such experiment consists of examining a single weld to see whether it is
defective. The sample space for this experiment can be abbreviated as $\mathcal{S}$ = {N, D},
where N represents not defective, D represents defective, and the braces are used to
enclose the elements of a set. Another such experiment would involve tossing a
thumbtack and noting whether it landed point up or point down, with sample space
$\mathcal{S}$ = {U, D}, and yet another would consist of observing the gender of the next child
born at the local hospital, with $\mathcal{S}$ = {M, F}. ■


EXAMPLE 2.2 If we examine three welds in sequence and note the result of each examination, then
an outcome for the entire experiment is any sequence of N’s and D’s of length 3, so


$\mathcal{S}$ = {NNN, NND, NDN, NDD, DNN, DND, DDN, DDD}


If we had tossed a thumbtack three times, the sample space would be obtained by
replacing N by U in $\mathcal{S}$ above, with a similar notational change yielding the sample space
for the experiment in which the genders of three newborn children are observed. ■
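
As an illustration of Example 2.2, the eight outcomes can be generated mechanically rather than listed by hand. The following is a minimal Python sketch; the helper name `sample_space` is hypothetical, and the standard-library function `itertools.product` does the enumeration.

```python
from itertools import product


def sample_space(symbols, trials):
    """List every sequence of `trials` results, each drawn from `symbols`."""
    return ["".join(seq) for seq in product(symbols, repeat=trials)]


# Three welds, each either not defective (N) or defective (D).
welds = sample_space("ND", 3)
print(welds)
# ['NNN', 'NND', 'NDN', 'NDD', 'DNN', 'DND', 'DDN', 'DDD']

# The thumbtack version is the same space with N replaced by U.
tacks = [outcome.replace("N", "U") for outcome in welds]
print(tacks[0], len(tacks))  # UUU 8
```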


EXAMPLE 2.3 Two gas stations are located at a certain intersection. Each one has six gas pumps. Consider
the experiment in which the number of pumps in use at a particular time of day is determined
for each of the stations. An experimental outcome specifies how many pumps are in
use at the first station and how many are in use at the second one. One possible outcome is
(2, 2), another is (4, 1), and yet another is (1, 4). The 49 outcomes in $\mathcal{S}$ are displayed in the
accompanying table. The sample space for the experiment in which a six-sided die is thrown
twice results from deleting the 0 row and 0 column from the table, giving 36 outcomes.


                                Second Station
First Station   0       1       2       3       4       5       6
0               (0, 0)  (0, 1)  (0, 2)  (0, 3)  (0, 4)  (0, 5)  (0, 6)
1               (1, 0)  (1, 1)  (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
2               (2, 0)  (2, 1)  (2, 2)  (2, 3)  (2, 4)  (2, 5)  (2, 6)
3               (3, 0)  (3, 1)  (3, 2)  (3, 3)  (3, 4)  (3, 5)  (3, 6)
4               (4, 0)  (4, 1)  (4, 2)  (4, 3)  (4, 4)  (4, 5)  (4, 6)
5               (5, 0)  (5, 1)  (5, 2)  (5, 3)  (5, 4)  (5, 5)  (5, 6)
6               (6, 0)  (6, 1)  (6, 2)  (6, 3)  (6, 4)  (6, 5)  (6, 6)
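
The counts quoted in Example 2.3 are easy to confirm by enumeration. The sketch below is one way to do so in Python; the variable names `pumps` and `dice` are illustrative only.

```python
from itertools import product

# Every (first station, second station) pair with 0 through 6 pumps in use.
pumps = list(product(range(7), repeat=2))

# Dropping the 0 row and the 0 column leaves the two-dice sample space.
dice = [(i, j) for (i, j) in pumps if i != 0 and j != 0]

print(len(pumps), len(dice))  # 49 36
```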
EXAMPLE 2.4 A reasonably large percentage of C++ programs written at a particular company
compile on the first run, but some do not (a compiler is a program that translates
source code, in this case C++ programs, into machine language so programs can
be executed). Suppose an experiment consists of selecting and compiling C++ programs
at this location one by one until encountering a program that compiles on the
first run. Denote a program that compiles on the first run by S (for success) and one
that doesn’t do so by F (for failure). Although it may not be very likely, a possible
outcome of this experiment is that the first 5 (or 10 or 20 or ...) are F’s and the next
one is an S. That is, for any positive integer n, we may have to examine n programs
before seeing the first S. The sample space is $\mathcal{S}$ = {S, FS, FFS, FFFS, ...}, which
contains an infinite number of possible outcomes. The same abbreviated form of
the sample space is appropriate for an experiment in which, starting at a specified
time, the gender of each newborn infant is recorded until the birth of a male is
observed. ■


Events 


In our study of probability, we will be interested not only in the individual outcomes
of $\mathcal{S}$ but also in various collections of outcomes from $\mathcal{S}$.


DEFINITION An event is any collection (subset) of outcomes contained in the sample space
$\mathcal{S}$. An event is simple if it consists of exactly one outcome and compound if
it consists of more than one outcome.


When an experiment is performed, a particular event A is said to occur if the result- 
ing experimental outcome is contained in A. In general, exactly one simple event will 
occur, but many compound events will occur simultaneously. 


EXAMPLE 2.5 Consider an experiment in which each of three vehicles taking a particular freeway
exit turns left (L) or right (R) at the end of the exit ramp. The eight possible outcomes
that comprise the sample space are LLL, RLL, LRL, LLR, LRR, RLR, RRL, and RRR.
Thus there are eight simple events, among which are $E_1$ = {LLL} and $E_5$ = {LRR}.
Some compound events include


A = {RLL, LRL, LLR} = the event that exactly one of the three
vehicles turns right

B = {LLL, RLL, LRL, LLR} = the event that at most one of the
vehicles turns right

C = {LLL, RRR} = the event that all three vehicles turn in the
same direction


Suppose that when the experiment is performed, the outcome is LLL. Then the simple
event $E_1$ has occurred and so also have the events B and C (but not A). ■
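
Because an event is simply a subset of the sample space, the events of Example 2.5 and the notion of an event "occurring" can be mirrored with Python sets. This is a sketch only; the helper `occurred` is a hypothetical name introduced for illustration.

```python
# Sample space for three vehicles, each turning left (L) or right (R).
outcomes = {"LLL", "RLL", "LRL", "LLR", "LRR", "RLR", "RRL", "RRR"}

# Events are built as subsets satisfying a condition.
A = {o for o in outcomes if o.count("R") == 1}   # exactly one turns right
B = {o for o in outcomes if o.count("R") <= 1}   # at most one turns right
C = {o for o in outcomes if len(set(o)) == 1}    # all turn the same way


def occurred(event, outcome):
    """An event occurs when the observed outcome is one of its members."""
    return outcome in event


observed = "LLL"
print(occurred(A, observed), occurred(B, observed), occurred(C, observed))
# False True True
```

A single observed outcome can therefore make several compound events occur at once, which is the point made at the end of the example.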


EXAMPLE 2.6 (Example 2.3 continued) When the number of pumps in use at each of two six-pump gas stations is observed,
there are 49 possible outcomes, so there are 49 simple events: $E_1$ = {(0, 0)}, $E_2$ =
{(0, 1)}, ..., $E_{49}$ = {(6, 6)}. Examples of compound events are

A = {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)} = the event that
the number of pumps in use is the same for both stations
B = {(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)} = the event that
the total number of pumps in use is four
C = {(0, 0), (0, 1), (1, 0), (1, 1)} = the event that
at most one pump is in use at each station ■


EXAMPLE 2.7 (Example 2.4 continued) The sample space for the program compilation experiment contains an infinite
number of outcomes, so there are an infinite number of simple events. Compound events
include


A = {S, FS, FFS} = the event that at most three programs are examined


E = {FS, FFFS, FFFFFS, ...} = the event that an even number of
programs are examined ■


Some Relations from Set Theory 


An event is just a set, so relationships and results from elementary set theory can be 
used to study events. The following operations will be used to create new events 
from given events. 


DEFINITION 1. The complement of an event A, denoted by $A'$, is the set of all outcomes in
$\mathcal{S}$ that are not contained in A.

2. The union of two events A and B, denoted by $A \cup B$ and read “A or B,” is
the event consisting of all outcomes that are either in A or in B or in both
events (so that the union includes outcomes for which both A and B occur
as well as outcomes for which exactly one occurs); that is, all outcomes in
at least one of the events.


3. The intersection of two events A and B, denoted by $A \cap B$ and read “A and
B,” is the event consisting of all outcomes that are in both A and B.


EXAMPLE 2.8 (Example 2.3 continued) For the experiment in which the number of pumps in use at a single six-pump gas
station is observed, let A = {0, 1, 2, 3, 4}, B = {3, 4, 5, 6}, and C = {1, 3, 5}. Then

$A'$ = {5, 6}, $A \cup B$ = {0, 1, 2, 3, 4, 5, 6} = $\mathcal{S}$, $A \cup C$ = {0, 1, 2, 3, 4, 5},
$A \cap B$ = {3, 4}, $A \cap C$ = {1, 3}, $(A \cap C)'$ = {0, 2, 4, 5, 6} ■
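
The complement, union, and intersection computed in Example 2.8 correspond directly to Python's built-in set operators. The sketch below reproduces those results; `complement` is a hypothetical helper, since Python has no complement operator without an explicit sample space.

```python
# Number of pumps in use at a single six-pump station.
S = set(range(7))
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {1, 3, 5}


def complement(event, space):
    """Outcomes of the sample space that are not in the event."""
    return space - event


print(complement(A, S))      # {5, 6}
print((A | B) == S)          # True: the union is the whole sample space
print(A | C)                 # {0, 1, 2, 3, 4, 5}
print(A & B)                 # {3, 4}
print(A & C)                 # {1, 3}
print(complement(A & C, S))  # {0, 2, 4, 5, 6}
```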


EXAMPLE 2.9 (Example 2.4 continued) In the program compilation experiment, define A, B, and C by


A = {S, FS, FFS}, B = {S, FFS, FFFFS}, C = {FS, FFFS, FFFFFS, ...}


Then


$A'$ = {FFFS, FFFFS, FFFFFS, ...}, $C'$ = {S, FFS, FFFFS, ...}
$A \cup B$ = {S, FS, FFS, FFFFS}, $A \cap B$ = {S, FFS} ■
Sometimes A and B have no outcomes in common, so that the intersection of
A and B contains no outcomes.


DEFINITION Let $\varnothing$ denote the null event (the event consisting of no outcomes whatsoever).
When $A \cap B = \varnothing$, A and B are said to be mutually exclusive or disjoint events.


EXAMPLE 2.10 A small city has three automobile dealerships: a GM dealer selling Chevrolets and 
Buicks; a Ford dealer selling Fords and Lincolns; and a Toyota dealer. If an experi- 
ment consists of observing the brand of the next car sold, then the events 
A = {Chevrolet, Buick} and B = {Ford, Lincoln} are mutually exclusive because 
the next car sold cannot be both a GM product and a Ford product (at least until the
two companies merge!). ■


The operations of union and intersection can be extended to more than two events.
For any three events A, B, and C, the event $A \cup B \cup C$ is the set of outcomes contained
in at least one of the three events, whereas $A \cap B \cap C$ is the set of outcomes contained
in all three events. Given events $A_1, A_2, A_3, \ldots$, these events are said to be mutually exclusive
(or pairwise disjoint) if no two events have any outcomes in common.
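
The pairwise condition in this definition can be checked directly by testing every pair of events. The following Python sketch assumes events are represented as sets; the function name `mutually_exclusive` and the three example events (defined on the pump counts 0 through 6) are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def mutually_exclusive(events):
    """True when no two of the given events share an outcome."""
    return all(a.isdisjoint(b) for a, b in combinations(events, 2))


# Hypothetical events on the number of pumps in use at one station.
low, middle, high = {0, 1}, {2, 3}, {4, 5, 6}

print(mutually_exclusive([low, middle, high]))    # True
print(mutually_exclusive([low, middle, {3, 4}]))  # False: 3 is shared
```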

A pictorial representation of events and manipulations with events is obtained by
using Venn diagrams. To construct a Venn diagram, draw a rectangle whose interior will
represent the sample space $\mathcal{S}$. Then any event A is represented as the interior of a closed
curve (often a circle) contained in $\mathcal{S}$. Figure 2.1 shows examples of Venn diagrams.


Figure 2.1 Venn diagrams: (a) Venn diagram of events A and B; (b) shaded region is $A \cap B$; (c) shaded region is $A \cup B$; (d) shaded region is $A'$; (e) mutually exclusive events; (f) shaded region is $A \cup B \cup C$; (g) shaded region is $A \cap B \cap C$


EXERCISES Section 2.1 (1-10) 


1. Four universities (1, 2, 3, and 4) are participating in a
holiday basketball tournament. In the first round, 1 will play
2 and 3 will play 4. Then the two winners will play for the
championship, and the two losers will also play. One possible
outcome can be denoted by 1324 (1 beats 2 and 3 beats
4 in first-round games, and then 1 beats 3 and 2 beats 4).

a. List all outcomes in $\mathcal{S}$.

b. Let A denote the event that 1 wins the tournament.
List outcomes in A.

c. Let B denote the event that 2 gets into the championship
game. List outcomes in B.

d. What are the outcomes in $A \cup B$ and in $A \cap B$?
What are the outcomes in $A'$?


2. Suppose that vehicles taking a particular freeway exit
can turn right (R), turn left (L), or go straight (S).
Consider observing the direction for each of three
successive vehicles.

a. List all outcomes in the event A that all three vehicles
go in the same direction.

b. List all outcomes in the event B that all three vehicles
take different directions.

c. List all outcomes in the event C that exactly two of
the three vehicles turn right.

d. List all outcomes in the event D that exactly two
vehicles go in the same direction.

e. List outcomes in $D'$, $C \cup D$, and $C \cap D$.


3. Three components are connected to form a system as
shown in the accompanying diagram. Because the components
in the 2-3 subsystem are connected in parallel, that
subsystem will function if at least one of the two individual
components functions. For the entire system to function,
component 1 must function and so must the 2-3 subsystem.

The experiment consists of determining the condition of
each component [S (success) for a functioning component
and F (failure) for a nonfunctioning component].

a. Which outcomes are contained in the event A that
exactly two out of the three components function?

b. Which outcomes are contained in the event B that at
least two of the components function?

c. Which outcomes are contained in the event C that the
system functions?

d. List outcomes in $C'$, $A \cup C$, $A \cap C$, $B \cup C$, and
$B \cap C$.


4. Each of a sample of four home mortgages is classified as
fixed rate (F) or variable rate (V).

a. What are the 16 outcomes in $\mathcal{S}$?

b. Which outcomes are in the event that exactly three of
the selected mortgages are fixed rate?

c. Which outcomes are in the event that all four mortgages
are of the same type?

d. Which outcomes are in the event that at most one of
the four is a variable-rate mortgage?

e. What is the union of the events in parts (c) and (d),
and what is the intersection of these two events?

f. What are the union and intersection of the two events
in parts (b) and (c)?


5. A family consisting of three persons (A, B, and C) goes
to a medical clinic that always has a doctor at each of
stations 1, 2, and 3. During a certain week, each member
of the family visits the clinic once and is assigned at
random to a station. The experiment consists of recording
the station number for each member. One outcome is (1,
2, 1) for A to station 1, B to station 2, and C to station 1.

a. List the 27 outcomes in the sample space.

b. List all outcomes in the event that all three members
go to the same station.

c. List all outcomes in the event that all members go to
different stations.

d. List all outcomes in the event that no one goes to
station 2.


6. A college library has five copies of a certain text on
reserve. Two copies (1 and 2) are first printings, and the
other three (3, 4, and 5) are second printings. A student
examines these books in random order, stopping only
when a second printing has been selected. One possible
outcome is 5, and another is 213.

a. List the outcomes in $\mathcal{S}$.

b. Let A denote the event that exactly one book must be
examined. What outcomes are in A?

c. Let B be the event that book 5 is the one selected.
What outcomes are in B?

d. Let C be the event that book 1 is not examined. What
outcomes are in C?


7. An academic department has just completed voting by
secret ballot for a department head. The ballot box contains
four slips with votes for candidate A and three slips
with votes for candidate B. Suppose these slips are
removed from the box one by one.

a. List all possible outcomes.

b. Suppose a running tally is kept as slips are removed.
For what outcomes does A remain ahead of B
throughout the tally?


8. An engineering construction firm is currently working
on power plants at three different sites. Let $A_i$ denote
the event that the plant at site i is completed by the
contract date. Use the operations of union, intersection,
and complementation to describe each of the
following events in terms of $A_1$, $A_2$, and $A_3$, draw a
Venn diagram, and shade the region corresponding to
each one.

a. At least one plant is completed by the contract
date.

b. All plants are completed by the contract date.

c. Only the plant at site 1 is completed by the contract date.

d. Exactly one plant is completed by the contract date.

e. Either the plant at site 1 or both of the other two
plants are completed by the contract date.


9. Use Venn diagrams to verify the following two relationships
for any events A and B (these are called De Morgan’s laws):
a. $(A \cup B)' = A' \cap B'$
b. $(A \cap B)' = A' \cup B'$


[Hint: In each part, draw a diagram corresponding to the
left side and another corresponding to the right side.]


10. a. In Example 2.10, identify three events that are mutually
exclusive.

b. Suppose there is no outcome common to all three of
the events A, B, and C. Are these three events necessarily
mutually exclusive? If your answer is yes,
explain why; if your answer is no, give a counterexample
using the experiment of Example 2.10.
2.2 Axioms, Interpretations, and Properties of Probability


Given an experiment and a sample space $\mathcal{S}$, the objective of probability is to assign
to each event A a number P(A), called the probability of the event A, which will give
a precise measure of the chance that A will occur. To ensure that the probability
assignments will be consistent with our intuitive notions of probability, all assignments
should satisfy the following axioms (basic properties) of probability.


AXIOM 1 For any event A, $P(A) \geq 0$.
AXIOM 2 $P(\mathcal{S}) = 1$.
AXIOM 3 If $A_1, A_2, A_3, \ldots$ is an infinite collection of disjoint events, then


$$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = \sum_{i=1}^{\infty} P(A_i)$$


You might wonder why the third axiom contains no reference to a finite collection
of disjoint events. It is because the corresponding property for a finite collection can be
derived from our three axioms. We want the axiom list to be as short as possible and
not contain any property that can be derived from others on the list. Axiom 1 reflects
the intuitive notion that the chance of A occurring should be nonnegative. The sample
space is by definition the event that must occur when the experiment is performed ($\mathcal{S}$
contains all possible outcomes), so Axiom 2 says that the maximum possible probability
of 1 is assigned to $\mathcal{S}$. The third axiom formalizes the idea that if we wish the
probability that at least one of a number of events will occur and no two of the events
can occur simultaneously, then the chance of at least one occurring is the sum of the
chances of the individual events.


PROPOSITION $P(\varnothing) = 0$, where $\varnothing$ is the null event (the event containing no outcomes
whatsoever). This in turn implies that the property contained in Axiom 3 is valid
for a finite collection of disjoint events.


Proof First consider the infinite collection $A_1 = \varnothing, A_2 = \varnothing, A_3 = \varnothing, \ldots$. Since
$\varnothing \cap \varnothing = \varnothing$, the events in this collection are disjoint and $\cup A_i = \varnothing$. The third
axiom then gives


$$P(\varnothing) = \sum P(\varnothing)$$


This can happen only if $P(\varnothing) = 0$.
Now suppose that $A_1, A_2, \ldots, A_k$ are disjoint events, and append to these the infinite
collection $A_{k+1} = \varnothing, A_{k+2} = \varnothing, A_{k+3} = \varnothing, \ldots$. Again invoking the third axiom,


$$P\left(\bigcup_{i=1}^{k} A_i\right) = P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) = \sum_{i=1}^{k} P(A_i)$$


as desired. ■
EXAMPLE 2.11 Consider tossing a thumbtack in the air. When it comes to rest on the ground, either
its point will be up (the outcome U) or down (the outcome D). The sample space for
this event is therefore $\mathcal{S}$ = {U, D}. The axioms specify $P(\mathcal{S}) = 1$, so the probability
assignment will be completed by determining P(U) and P(D). Since U and D are
disjoint and their union is $\mathcal{S}$, the foregoing proposition implies that


$$1 = P(\mathcal{S}) = P(U) + P(D)$$


It follows that P(D) = 1 - P(U). One possible assignment of probabilities is
P(U) = .5, P(D) = .5, whereas another possible assignment is P(U) = .75,
P(D) = .25. In fact, letting p represent any fixed number between 0 and 1, P(U) = p,
P(D) = 1 - p is an assignment consistent with the axioms. ■


EXAMPLE 2.12 Consider testing batteries coming off an assembly line one by one until one having
a voltage within prescribed limits is found. The simple events are $E_1 = \{S\}$,
$E_2 = \{FS\}$, $E_3 = \{FFS\}$, $E_4 = \{FFFS\}, \ldots$. Suppose the probability of any particular
battery being satisfactory is .99. Then it can be shown that $P(E_1) = .99$,
$P(E_2) = (.01)(.99)$, $P(E_3) = (.01)^2(.99), \ldots$ is an assignment of probabilities to the
simple events that satisfies the axioms. In particular, because the $E_i$’s are disjoint
and $\mathcal{S} = E_1 \cup E_2 \cup E_3 \cup \cdots$, it must be the case that

$$1 = P(\mathcal{S}) = P(E_1) + P(E_2) + P(E_3) + \cdots = .99[1 + .01 + (.01)^2 + (.01)^3 + \cdots]$$

Here we have used the formula for the sum of a geometric series:

$$a + ar + ar^2 + ar^3 + \cdots = \frac{a}{1 - r}$$

However, another legitimate (according to the axioms) probability assignment
of the same “geometric” type is obtained by replacing .99 by any other number p
between 0 and 1 (and .01 by $1 - p$). ■
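
The geometric assignment of Example 2.12 can be checked numerically. The following is only a sketch: the function name `geometric_probs` and the truncation to a finite number of terms are assumptions made for illustration, not part of the text.

```python
def geometric_probs(p, n_terms):
    """P(E_i) = (1 - p)**(i - 1) * p for i = 1, ..., n_terms."""
    return [(1 - p) ** (i - 1) * p for i in range(1, n_terms + 1)]

# First ten simple-event probabilities when a battery is satisfactory with probability .99
probs = geometric_probs(0.99, 10)
partial_sum = sum(probs)

# Finite geometric sum a(1 - r^n)/(1 - r) with a = p and r = 1 - p reduces to 1 - (1 - p)^n
closed_form = 1 - (1 - 0.99) ** 10

print(probs[:3])
print(partial_sum, closed_form)
```

The partial sums approach 1 for any p strictly between 0 and 1, which is what the axioms require of the full assignment.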


Interpreting Probability 


Examples 2.11 and 2.12 show that the axioms do not completely determine an 
assignment of probabilities to events. The axioms serve only to rule out assignments 
inconsistent with our intuitive notions of probability. In the tack-tossing experiment 
of Example 2.11, two particular assignments were suggested. The appropriate or 
correct assignment depends on the nature of the thumbtack and also on one’s inter- 
pretation of probability. The interpretation that is most frequently used and most 
easily understood is based on the notion of relative frequencies. 

Consider an experiment that can be repeatedly performed in an identical and 
independent fashion, and let A be an event consisting of a fixed set of outcomes of 
the experiment. Simple examples of such repeatable experiments include the tack- 
tossing and die-tossing experiments previously discussed. If the experiment is per- 
formed n times, on some of the replications the event A will occur (the outcome will 
be in the set A), and on others, A will not occur. Let n(A) denote the number of repli- 
cations on which A does occur. Then the ratio n(A)/n is called the relative frequency 
of occurrence of the event A in the sequence of n replications. 

For example, let A be the event that a package sent within the state of California
for 2nd-day delivery actually arrives within one day. The results from sending 10
such packages (the first 10 replications) are as follows:


Package #                  1    2    3     4    5    6    7     8     9     10
Did A occur?               N    Y    Y     Y    N    N    Y     Y     N     N
Relative frequency of A    0    .5   .667  .75  .6   .5   .571  .625  .556  .5




[Figure 2.2: two panels plotting relative frequency (vertical axis, 0 to 1.0) against number of packages (horizontal axis). Panel (a) covers the first 50 packages, with annotated points 5/10 = .50 and 9/15 = .60; panel (b) covers 1000 packages, with the relative frequency approaching .6.]


Figure 2.2 Behavior of relative frequency (a) Initial fluctuation (b) Long-run stabilization 


Figure 2.2(a) shows how the relative frequency n(A)/n fluctuates rather substan- 
tially over the course of the first 50 replications. But as the number of replications 
continues to increase, Figure 2.2(b) illustrates how the relative frequency stabilizes. 

More generally, empirical evidence, based on the results of many such repeat- 
able experiments, indicates that any relative frequency of this sort will stabilize as 
the number of replications n increases. That is, as n gets arbitrarily large, n(A)/n 
approaches a limiting value referred to as the limiting (or long-run) relative frequency 
of the event A. The objective interpretation of probability identifies this limiting rela- 
tive frequency with P(A). Suppose that probabilities are assigned to events in accord- 
ance with their limiting relative frequencies. Then a statement such as “the probability 
of a package being delivered within one day of mailing is .6” means that of a large 
number of mailed packages, roughly 60% will arrive within one day. Similarly, if B is 
the event that an appliance of a particular type will need service while under warranty, 
then P(B) = .1 is interpreted to mean that in the long run 10% of such appliances will 
need warranty service. This doesn’t mean that exactly 1 out of 10 will need service, or
that exactly 10 out of 100 will need service, because 10 and 100 are not the long run. 
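
The running relative frequency n(A)/n is easy to compute directly. The sketch below first reproduces the ten package results tabulated earlier and then simulates a long run; the helper name `relative_frequencies`, the value p = 0.6, and the random seed are assumptions chosen for illustration, and a simulation only suggests stabilization rather than proving it.

```python
import random

def relative_frequencies(occurred):
    """Running relative frequency n(A)/n after each replication."""
    count = 0
    freqs = []
    for n, hit in enumerate(occurred, start=1):
        count += 1 if hit else 0
        freqs.append(count / n)
    return freqs

# The ten package results from the text (Y = A occurred)
table_freqs = relative_frequencies(c == "Y" for c in "NYYYNNYYNN")
print([round(f, 3) for f in table_freqs])

# Simulated long run; p = 0.6 and the seed are assumptions for illustration
rng = random.Random(0)
simulated = relative_frequencies(rng.random() < 0.6 for _ in range(100_000))
print(simulated[9], simulated[999], simulated[-1])
```

Early values fluctuate noticeably, while the value after many replications sits close to the assumed limiting relative frequency.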

This relative frequency interpretation of probability is said to be objective 
because it rests on a property of the experiment rather than on any particular indi- 
vidual concerned with the experiment. For example, two different observers of a 
sequence of coin tosses should both use the same probability assignments since the 
observers have nothing to do with limiting relative frequency. In practice, this inter- 
pretation is not as objective as it might seem, since the limiting relative frequency of 
an event will not be known. Thus we will have to assign probabilities based on our 
beliefs about the limiting relative frequency of events under study. Fortunately, there 
are many experiments for which there will be a consensus with respect to probability 
assignments. When we speak of a fair coin, we shall mean P(H) = P(T) = .5, and a
fair die is one for which limiting relative frequencies of the six outcomes are all 1/6,
suggesting probability assignments $P(\{1\}) = \cdots = P(\{6\}) = 1/6$.

Because the objective interpretation of probability is based on the notion of 
limiting frequency, its applicability is limited to experimental situations that are 
repeatable. Yet the language of probability is often used in connection with situations
that are inherently unrepeatable. Examples include: “The chances are good for a
peace agreement”; “It is likely that our company will be awarded the contract”; and 
“Because their best quarterback is injured, I expect them to score no more than 10 
points against us.” In such situations we would like, as before, to assign numerical 
probabilities to various outcomes and events (e.g., the probability is .9 that we will 
get the contract). This necessitates adopting an alternative interpretation of these 
probabilities. Because different observers may have different prior information and 
opinions concerning such experimental situations, probability assignments may now 
differ from individual to individual. Interpretations in such situations are thus referred 
to as subjective. The book by Robert Winkler listed in the chapter references gives a 
very readable survey of several subjective interpretations. 


More Probability Properties 


PROPOSITION For any event A, P(A) + P(A’) = 1, from which P(A) = 1 — P(A’). 


Proof In Axiom 3, let k = 2, $A_1 = A$, and $A_2 = A'$. Since by definition of $A'$,
$A \cup A' = \mathcal{S}$ while A and $A'$ are disjoint, $1 = P(\mathcal{S}) = P(A \cup A') = P(A) + P(A')$. ■


This proposition is surprisingly useful because there are many situations in 
which P(A’) is more easily obtained by direct methods than is P(A). 


EXAMPLE 2.13 Consider a system of five identical components connected in series, as illustrated 
in Figure 2.3. 


Figure 2.3 A system of five components connected in a series 


Denote a component that fails by F and one that doesn’t fail by S (for success). Let A be 
the event that the system fails. For A to occur, at least one of the individual components 
must fail. Outcomes in A include SSFSS (1, 2, 4, and 5 all work, but 3 does not), FFSSS, 
and so on. There are in fact 31 different outcomes in A. However, A’, the event that the 
system works, consists of the single outcome SSSSS. We will see in Section 2.5 that if 
90% of all such components do not fail and different components fail independently 
of one another, then $P(A') = P(SSSSS) = .9^5 = .59$. Thus $P(A) = 1 - .59 = .41$; so
among a large number of such systems, roughly 41% will fail. ■
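
The counts and probabilities in Example 2.13 can be verified by brute-force enumeration. This is a sketch under the example's own assumptions (each component works with probability .9, independently of the others); the variable names are illustrative.

```python
from itertools import product

# All 2^5 outcomes, e.g. "SSFSS"; S = component works, F = component fails
outcomes = ["".join(t) for t in product("SF", repeat=5)]
system_fails = [o for o in outcomes if "F" in o]  # series system fails if any component fails

# Complement rule: P(A) = 1 - P(A')
p_works = 0.9 ** 5
p_fails = 1 - p_works

# Direct route: add up the probabilities of all outcomes in A
p_fails_direct = sum(0.9 ** o.count("S") * 0.1 ** o.count("F") for o in system_fails)

print(len(outcomes), len(system_fails))
print(round(p_works, 2), round(p_fails, 2))
```

The direct route requires summing 31 terms, whereas the complement needs only one, which is the point of the proposition.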


In general, the foregoing proposition is useful when the event of interest can be 
expressed as “at least ...,” since then the complement “less than ...” may be easier to 
work with (in some problems, “more than ...” is easier to deal with than “at most ...”). 
When you are having difficulty calculating P(A) directly, think of determining P(A’). 


PROPOSITION For any event A, $P(A) \le 1$.


This is because $1 = P(A) + P(A') \ge P(A)$ since $P(A') \ge 0$.
When events A and B are mutually exclusive, $P(A \cup B) = P(A) + P(B)$.
For events that are not mutually exclusive, adding P(A) and P(B) results in
“double-counting” outcomes in the intersection. The next result, the addition
rule for a double union probability, shows how to correct for this.


PROPOSITION For any two events A and B,

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$


Proof Note first that $A \cup B$ can be decomposed into two disjoint events, A and
$B \cap A'$; the latter is the part of B that lies outside A (see Figure 2.4). Furthermore,
B itself is the union of the two disjoint events $A \cap B$ and $A' \cap B$, so
$P(B) = P(A \cap B) + P(A' \cap B)$. Thus

$$P(A \cup B) = P(A) + P(B \cap A') = P(A) + [P(B) - P(A \cap B)] = P(A) + P(B) - P(A \cap B)$$


Figure 2.4 Representing $A \cup B$ as a union of disjoint events ■


EXAMPLE 2.14 In a certain residential suburb, 60% of all households get Internet service from the
local cable company, 80% get television service from that company, and 50% get
both services from that company. If a household is randomly selected, what is the
probability that it gets at least one of these two services from the company, and what
is the probability that it gets exactly one of these services from the company?

With A = {gets Internet service} and B = {gets TV service}, the given information
implies that P(A) = .6, P(B) = .8, and $P(A \cap B) = .5$. The foregoing proposition
now yields

$$P(\text{subscribes to at least one of the two services}) = P(A \cup B) = P(A) + P(B) - P(A \cap B) = .6 + .8 - .5 = .9$$

The event that a household subscribes only to TV service can be written as $A' \cap B$
[(not Internet) and TV]. Now Figure 2.4 implies that

$$.9 = P(A \cup B) = P(A) + P(A' \cap B) = .6 + P(A' \cap B)$$

from which $P(A' \cap B) = .3$. Similarly, $P(A \cap B') = P(A \cup B) - P(B) = .1$. This is
all illustrated in Figure 2.5, from which we see that

$$P(\text{exactly one}) = P(A \cap B') + P(A' \cap B) = .1 + .3 = .4$$


[Figure 2.5: Venn diagram of A and B with the regions $P(A \cap B')$ and $P(A' \cap B)$ labeled]

Figure 2.5 Probabilities for Example 2.14 ■
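
The arithmetic of Example 2.14 can be replayed exactly with rational numbers. The sketch below uses Python's `fractions.Fraction` to avoid floating-point rounding; the variable names are illustrative choices, not notation from the text.

```python
from fractions import Fraction

p_a = Fraction(6, 10)      # P(A): gets Internet service
p_b = Fraction(8, 10)      # P(B): gets TV service
p_both = Fraction(5, 10)   # P(A and B)

# Addition rule for a double union
p_union = p_a + p_b - p_both

# Each "only" region is the union minus the other event
p_only_b = p_union - p_a   # P(A' and B)
p_only_a = p_union - p_b   # P(A and B')
p_exactly_one = p_only_a + p_only_b

print(p_union, p_only_b, p_only_a, p_exactly_one)
```

As a consistency check, the three disjoint regions (only A, only B, both) add up to the union probability.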


The addition rule for a triple union probability is similar to the foregoing rule.


PROPOSITION For any three events A, B, and C,

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$


This can be verified by examining the Venn diagram of $A \cup B \cup C$, shown in Figure
2.6. When P(A), P(B), and P(C) are added, the intersection probabilities $P(A \cap B)$,
$P(A \cap C)$, and $P(B \cap C)$ are all counted twice. Each one must therefore be subtracted.
But then $P(A \cap B \cap C)$ has been added in three times and subtracted out three times,
so it must be added back. In general, the probability of a union of k events is obtained
by summing individual event probabilities, subtracting double intersection probabilities,
adding triple intersection probabilities, subtracting quadruple intersection probabilities,
and so on.


Figure 2.6 $A \cup B \cup C$
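
The alternating pattern described above can be sketched for events represented as Python sets. To keep the check concrete, the example assumes a sample space of 12 equally likely outcomes and three made-up events; the function name `union_probability` and these events are hypothetical, chosen only to exercise the formula.

```python
from fractions import Fraction
from itertools import combinations

def union_probability(events, n_outcomes):
    """Inclusion-exclusion for events (sets) in a sample space of equally likely outcomes."""
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        sign = 1 if r % 2 == 1 else -1   # add singles, subtract doubles, add triples, ...
        for group in combinations(events, r):
            total += sign * Fraction(len(set.intersection(*group)), n_outcomes)
    return total

# Hypothetical events on the outcomes 1, ..., 12
A = {n for n in range(1, 13) if n % 2 == 0}   # even outcomes
B = {n for n in range(1, 13) if n % 3 == 0}   # multiples of 3
C = {n for n in range(1, 13) if n > 8}        # outcomes greater than 8

direct = Fraction(len(A | B | C), 12)          # count the union directly
via_rule = union_probability([A, B, C], 12)    # triple-union addition rule

print(direct, via_rule)
```

Both routes give the same value, and the same function handles a union of any number k of events.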


Determining Probabilities Systematically 


Consider a sample space that is either finite or “countably infinite” (the latter means
that outcomes can be listed in an infinite sequence, so there is a first outcome, a second
outcome, a third outcome, and so on; for example, the battery testing scenario of
Example 2.12). Let $E_1, E_2, E_3, \ldots$ denote the corresponding simple events, each consisting of a
single outcome. A sensible strategy for probability computation is to first determine each
simple event probability, with the requirement that $\sum P(E_i) = 1$. Then the probability
of any compound event A is computed by adding together the $P(E_i)$’s for all $E_i$’s in A:

$$P(A) = \sum_{\text{all } E_i \text{ in } A} P(E_i)$$

EXAMPLE 2.15 During off-peak hours a commuter train has five cars. Suppose a commuter is twice
as likely to select the middle car (#3) as to select either adjacent car (#2 or #4),
and is twice as likely to select either adjacent car as to select either end car (#1
or #5). Let $p_i = P(\text{car } i \text{ is selected}) = P(E_i)$. Then we have $p_3 = 2p_2 = 2p_4$ and
$p_2 = 2p_1 = 2p_5 = p_4$. This gives

$$1 = \sum P(E_i) = p_1 + 2p_1 + 4p_1 + 2p_1 + p_1 = 10p_1$$

implying $p_1 = p_5 = .1$, $p_2 = p_4 = .2$, $p_3 = .4$. The probability that one of the three
middle cars is selected (a compound event) is then $p_2 + p_3 + p_4 = .8$. ■
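
The strategy of Example 2.15 (fix the relative weights, normalize so the simple-event probabilities sum to 1, then add over the outcomes in the event) can be sketched as follows. The dictionary layout and the helper name `event_probability` are assumptions made for illustration.

```python
from fractions import Fraction

# Relative weights from the example: car 3 is twice as likely as cars 2 and 4,
# which are each twice as likely as cars 1 and 5
weights = {1: 1, 2: 2, 3: 4, 4: 2, 5: 1}
total = sum(weights.values())

# Normalize so that the simple-event probabilities sum to 1
probs = {car: Fraction(w, total) for car, w in weights.items()}

def event_probability(event, probs):
    """P(A) as the sum of P(E_i) over the outcomes in A."""
    return sum(probs[outcome] for outcome in event)

p_middle = event_probability({2, 3, 4}, probs)
print(probs, p_middle)
```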


Equally Likely Outcomes 


In many experiments consisting of N outcomes, it is reasonable to assign equal
probabilities to all N simple events. These include such obvious examples as tossing a fair
coin or fair die once or twice (or any fixed number of times), or selecting one or
several cards from a well-shuffled deck of 52. With $p = P(E_i)$ for every i,

$$1 = \sum_{i=1}^{N} P(E_i) = \sum_{i=1}^{N} p = p \cdot N \quad \text{so} \quad p = \frac{1}{N}$$

That is, if there are N equally likely outcomes, the probability for each is 1/N.
Now consider an event A, with N(A) denoting the number of outcomes contained
in A. Then

$$P(A) = \sum_{E_i \text{ in } A} P(E_i) = \sum_{E_i \text{ in } A} \frac{1}{N} = \frac{N(A)}{N}$$

Thus when outcomes are equally likely, computing probabilities reduces to
counting: determine both the number of outcomes N(A) in A and the number of
outcomes N in $\mathcal{S}$, and form their ratio.

EXAMPLE 2.16 You have six unread mysteries and six unread science fiction books on your bookshelf.
The first three of each type are hardcover, and the last three are paperback. Consider
randomly selecting one of the six mysteries and then randomly selecting one of the six
science fiction books to take on a post-finals vacation to Acapulco (after all, you need
something to read on the beach). Number the mysteries 1, 2,..., 6, and do the same for
the science fiction books. Then each outcome is a pair of numbers such as (4, 1), and
there are N = 36 possible outcomes. (For a visual of this situation, refer to the table in
Example 2.3 and delete the first row and column.) With random selection as described,
the 36 outcomes are equally likely. Nine of these outcomes are such that both selected
books are paperbacks (those in the lower right-hand corner of the referenced table): (4, 4),
(4, 5),..., (6, 6). So the probability of the event A that both selected books are paperbacks is

$$P(A) = \frac{N(A)}{N} = \frac{9}{36} = .25$$ ■
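
The count in Example 2.16 can be confirmed by listing the outcomes. A minimal sketch, assuming (as in the example) that books 4, 5, and 6 of each type are the paperbacks; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Each outcome is (mystery number, science fiction number)
outcomes = list(product(range(1, 7), repeat=2))

paperbacks = {4, 5, 6}   # the last three books of each type
event_a = [(m, s) for (m, s) in outcomes if m in paperbacks and s in paperbacks]

# Equally likely outcomes: P(A) = N(A) / N
p_a = Fraction(len(event_a), len(outcomes))
print(len(outcomes), len(event_a), p_a)
```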


EXERCISES Section 2.2 (11-28) 


11. A mutual fund company offers its customers a variety of funds: a money-market fund, three
different bond funds (short, intermediate, and long-term), two stock funds (moderate and
high-risk), and a balanced fund. Among customers who own shares in just one fund, the
percentages of customers in the different funds are as follows:

Money-market         20%     High-risk stock        18%
Short bond           15%     Moderate-risk stock    25%
Intermediate bond    10%     Balanced                7%
Long bond             5%

A customer who owns shares in just one fund is randomly selected.
a. What is the probability that the selected individual owns shares in the balanced fund?
b. What is the probability that the individual owns shares in a bond fund?
c. What is the probability that the selected individual does not own shares in a stock fund?

12. Consider randomly selecting a student at a large university, and let A be the event that the
selected student has a Visa card and B be the analogous event for MasterCard. Suppose that
P(A) = .6 and P(B) = .4.
a. Could it be the case that $P(A \cap B) = .5$? Why or why not? [Hint: See Exercise 24.]
b. From now on, suppose that $P(A \cap B) = .3$. What is the probability that the selected
student has at least one of these two types of cards?
c. What is the probability that the selected student has neither type of card?
d. Describe, in terms of A and B, the event that the selected student has a Visa card but
not a MasterCard, and then calculate the probability of this event.
e. Calculate the probability that the selected student has exactly one of the two types of cards.

13. A computer consulting firm presently has bids out on three projects. Let
$A_i$ = {awarded project i}, for i = 1, 2, 3, and suppose that $P(A_1) = .22$, $P(A_2) = .25$,
$P(A_3) = .28$, $P(A_1 \cap A_2) = .11$, $P(A_1 \cap A_3) = .05$, $P(A_2 \cap A_3) = .07$,
$P(A_1 \cap A_2 \cap A_3) = .01$. Express in words each of the following events, and compute the
probability of each event:
a. $A_1 \cup A_2$
b. $A_1' \cap A_2'$ [Hint: $(A_1 \cup A_2)' = A_1' \cap A_2'$]
c. $A_1 \cup A_2 \cup A_3$
d. $A_1' \cap A_2' \cap A_3'$
e. $A_1' \cap A_2' \cap A_3$
f. $(A_1' \cap A_2') \cup A_3$


14. Suppose that 55% of all adults regularly consume coffee, 45% regularly consume carbonated
soda, and 70% regularly consume at least one of these two products.
a. What is the probability that a randomly selected adult regularly consumes both coffee
and soda?
b. What is the probability that a randomly selected adult doesn’t regularly consume at least
one of these two products?

15. Consider the type of clothes dryer (gas or electric) purchased by each of five different
customers at a certain store.
a. If the probability that at most one of these purchases an electric dryer is .428, what is
the probability that at least two purchase an electric dryer?
b. If P(all five purchase gas) = .116 and P(all five purchase electric) = .005, what is the
probability that at least one of each type is purchased?

16. An individual is presented with three different glasses of cola, labeled C, D, and P. He is
asked to taste all three and then list them in order of preference. Suppose the same cola has
actually been put into all three glasses.
a. What are the simple events in this ranking experiment, and what probability would you
assign to each one?
b. What is the probability that C is ranked first?
c. What is the probability that C is ranked first and D is ranked last?

17. Let A denote the event that the next request for assistance from a statistical software
consultant relates to the SPSS package, and let B be the event that the next request is for
help with SAS. Suppose that P(A) = .30 and P(B) = .50.
a. Why is it not the case that P(A) + P(B) = 1?
b. Calculate $P(A')$.
c. Calculate $P(A \cup B)$.
d. Calculate $P(A' \cap B')$.

18. A wallet contains five \$10 bills, four \$5 bills, and six \$1 bills (nothing larger). If the
bills are selected one by one in random order, what is the probability that at least two bills
must be selected to obtain a first \$10 bill?

19. Human visual inspection of solder joints on printed circuit boards can be very subjective.
Part of the problem stems from the numerous types of solder defects (e.g., pad nonwetting,
knee visibility, voids) and even the degree to which a joint possesses one or more of these
defects. Consequently, even highly trained inspectors can disagree on the disposition of a
particular joint. In one batch of 10,000 joints, inspector A found 724 that were judged
defective, inspector B found 751 such joints, and 1159 of the joints were judged defective by
at least one of the inspectors. Suppose that one of the 10,000 joints is randomly selected.
a. What is the probability that the selected joint was judged to be defective by neither of
the two inspectors?
b. What is the probability that the selected joint was judged to be defective by inspector B
but not by inspector A?


20. A certain factory operates three different shifts. Over the last year, 200 accidents have
occurred at the factory. Some of these can be attributed at least in part to unsafe working
conditions, whereas the others are unrelated to working conditions. The accompanying table
gives the percentage of accidents falling in each type of accident-shift category.

                    Unsafe         Unrelated
                    Conditions     to Conditions
Shift    Day        10%            35%
         Swing       8%            20%
         Night       5%            22%

Suppose one of the 200 accident reports is randomly selected from a file of reports, and the
shift and type of accident are determined.
a. What are the simple events?
b. What is the probability that the selected accident was attributed to unsafe conditions?
c. What is the probability that the selected accident did not occur on the day shift?

21. An insurance company offers four different deductible levels (none, low, medium, and high)
for its homeowner’s policyholders and three different levels (low, medium, and high) for its
automobile policyholders. The accompanying table gives proportions for the various categories
of policyholders who have both types of insurance. For example, the proportion of individuals
with both low homeowner’s deductible and low auto deductible is .06 (6% of all such
individuals).

                  Homeowner’s
Auto       N       L       M       H
L         .04     .06     .05     .03
M         .07     .10     .20     .10
H         .02     .03     .15     .15

Suppose an individual having both types of policies is randomly selected.
a. What is the probability that the individual has a medium auto deductible and a high
homeowner’s deductible?
b. What is the probability that the individual has a low auto deductible? A low homeowner’s
deductible?
c. What is the probability that the individual is in the same category for both auto and
homeowner’s deductibles?
d. Based on your answer in part (c), what is the probability that the two categories are
different?
e. What is the probability that the individual has at least one low deductible level?
f. Using the answer in part (e), what is the probability that neither deductible level is low?

22. The route used by a certain motorist in commuting to work contains two intersections with
traffic signals. The probability that he must stop at the first signal is .4, the analogous
probability for the second signal is .5, and the probability that he must stop at at least one
of the two signals is .7. What is the probability that he must stop
a. At both signals?
b. At the first signal but not at the second one?
c. At exactly one signal?

23. The computers of six faculty members in a certain department are to be replaced. Two of the
faculty members have selected laptop machines and the other four have chosen desktop
machines. Suppose that only two of the setups can be done on a particular day, and the two
computers to be set up are randomly selected from the six (implying 15 equally likely
outcomes; if the computers are numbered 1, 2,..., 6, then one outcome consists of computers 1
and 2, another consists of computers 1 and 3, and so on).
a. What is the probability that both selected setups are for laptop computers?
b. What is the probability that both selected setups are desktop machines?
c. What is the probability that at least one selected setup is for a desktop computer?
d. What is the probability that at least one computer of each type is chosen for setup?

24. Show that if one event A is contained in another event B (i.e., A is a subset of B), then
$P(A) \le P(B)$. [Hint: For such A and B, A and $B \cap A'$ are disjoint and
$B = A \cup (B \cap A')$, as can be seen from a Venn diagram.] For general A and B, what does
this imply about the relationship among $P(A \cap B)$, P(A) and $P(A \cup B)$?

25. The three most popular options on a certain type of new car are a built-in GPS (A), a
sunroof (B), and an automatic transmission (C). If 40% of all purchasers request A, 55%
request B, 70% request C, 63% request A or B, 77% request A or C, 80% request B or C, and 85%
request A or B or C, determine the probabilities of the following events. [Hint: “A or B” is
the event that at least one of the two options is requested; try drawing a Venn diagram and
labeling all regions.]
a. The next purchaser will request at least one of the three options.
b. The next purchaser will select none of the three options.


c. The next purchaser will request only an automatic transmission and not either of the other
two options.
d. The next purchaser will select exactly one of these three options.

26. A certain system can experience three different types of defects. Let $A_i$ (i = 1, 2, 3)
denote the event that the system has a defect of type i. Suppose that

$P(A_1) = .12$    $P(A_2) = .07$    $P(A_3) = .05$
$P(A_1 \cup A_2) = .13$    $P(A_1 \cup A_3) = .14$
$P(A_2 \cup A_3) = .10$    $P(A_1 \cap A_2 \cap A_3) = .01$

a. What is the probability that the system does not have a type 1 defect?
b. What is the probability that the system has both type 1 and type 2 defects?
c. What is the probability that the system has both type 1 and type 2 defects but not a type 3
defect?
d. What is the probability that the system has at most two of these defects?

27. An academic department with five faculty members (Anderson, Box, Cox, Cramer, and Fisher)
must select two of its members to serve on a personnel review committee. Because the work
will be time-consuming, no one is anxious to serve, so it is decided that the representatives
will be selected by putting the names on identical pieces of paper and then randomly
selecting two.
a. What is the probability that both Anderson and Box will be selected? [Hint: List the
equally likely outcomes.]
b. What is the probability that at least one of the two members whose name begins with C is
selected?
c. If the five faculty members have taught for 3, 6, 7, 10, and 14 years, respectively, at the
university, what is the probability that the two chosen representatives have a total of at
least 15 years’ teaching experience there?

28. In Exercise 5, suppose that any incoming individual is equally likely to be assigned to any
of the three stations irrespective of where other individuals have been assigned. What is the
probability that
a. All three family members are assigned to the same station?
b. At most two family members are assigned to the same station?
c. Every family member is assigned to a different station?


2.3 Counting Techniques


When the various outcomes of an experiment are equally likely (the same probability
is assigned to each simple event), the task of computing probabilities reduces to
counting. Letting N denote the number of outcomes in a sample space and N(A)
represent the number of outcomes contained in an event A,

$$P(A) = \frac{N(A)}{N} \qquad (2.1)$$

If a list of the outcomes is easily obtained and N is small, then N and N(A) can be
determined without the benefit of any general counting principles.

There are, however, many experiments for which the effort involved in 
constructing such a list is prohibitive because N is quite large. By exploiting some 
general counting rules, it is possible to compute probabilities of the form (2.1) with- 
out a listing of outcomes. These rules are also useful in many problems involving 
outcomes that are not equally likely. Several of the rules developed here will be used 
in studying probability distributions in the next chapter. 


The Product Rule for Ordered Pairs 


Our first counting rule applies to any situation in which a set (event) consists of
ordered pairs of objects and we wish to count the number of such pairs. By an
ordered pair, we mean that, if $O_1$ and $O_2$ are objects, then the pair $(O_1, O_2)$ is
different from the pair $(O_2, O_1)$. For example, if an individual selects one airline for a trip
from Los Angeles to Chicago and (after transacting business in Chicago) a second
one for continuing on to New York, one possibility is (American, United), another is
(United, American), and still another is (United, United).


PROPOSITION If the first element or object of an ordered pair can be selected in $n_1$ ways, and
for each of these $n_1$ ways the second element of the pair can be selected in $n_2$
ways, then the number of pairs is $n_1 n_2$.


An alternative interpretation involves carrying out an operation that consists of two
stages. If the first stage can be performed in any one of $n_1$ ways, and for each such
way there are $n_2$ ways to perform the second stage, then $n_1 n_2$ is the number of ways
of carrying out the two stages in sequence.


EXAMPLE 2.17 A homeowner doing some remodeling requires the services of both a plumbing
contractor and an electrical contractor. If there are 12 plumbing contractors and 9 electrical
contractors available in the area, in how many ways can the contractors be chosen? If
we denote the plumbers by $P_1, \ldots, P_{12}$ and the electricians by $Q_1, \ldots, Q_9$, then we wish
the number of pairs of the form $(P_i, Q_j)$. With $n_1 = 12$ and $n_2 = 9$, the product rule
yields N = (12)(9) = 108 possible ways of choosing the two types of contractors. ■
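
For a case this small, the product rule can be checked by generating every pair. A minimal sketch using `itertools.product`; the label strings are illustrative stand-ins for the contractors.

```python
from itertools import product

plumbers = [f"P{i}" for i in range(1, 13)]      # P1, ..., P12
electricians = [f"Q{j}" for j in range(1, 10)]  # Q1, ..., Q9

# Every ordered pair (plumber, electrician)
pairs = list(product(plumbers, electricians))

print(len(pairs), len(plumbers) * len(electricians))
```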


In Example 2.17, the choice of the second element of the pair did not depend 
on which first element was chosen or occurred. As long as there is the same number 
of choices of the second element for each first element, the product rule is valid even 
when the set of possible second elements depends on the first element. 


EXAMPLE 2.18 A family has just moved to a new city and requires the services of both an
obstetrician and a pediatrician. There are two easily accessible medical clinics, each having
two obstetricians and three pediatricians. The family will obtain maximum health
insurance benefits by joining a clinic and selecting both doctors from that clinic.
In how many ways can this be done? Denote the obstetricians by $O_1$, $O_2$, $O_3$, and
$O_4$ and the pediatricians by $P_1, \ldots, P_6$. Then we wish the number of pairs $(O_i, P_j)$
for which $O_i$ and $P_j$ are associated with the same clinic. Because there are four
obstetricians, $n_1 = 4$, and for each there are three choices of pediatrician, so $n_2 = 3$.
Applying the product rule gives $N = n_1 n_2 = 12$ possible choices. ■
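
Here the set of available second elements depends on the first element, yet the count of choices per obstetrician is constant, so the product rule still applies. The sketch below enumerates the pairs; the assignment of particular doctors to particular clinics is an assumption made for illustration, since the example states only how many of each a clinic has.

```python
# Assumed staffing: which doctor belongs to which clinic is hypothetical
clinics = {
    "clinic 1": {"obstetricians": ["O1", "O2"], "pediatricians": ["P1", "P2", "P3"]},
    "clinic 2": {"obstetricians": ["O3", "O4"], "pediatricians": ["P4", "P5", "P6"]},
}

# Only pairs drawn from the same clinic are allowed
pairs = [
    (o, p)
    for staff in clinics.values()
    for o in staff["obstetricians"]
    for p in staff["pediatricians"]
]

# Number of pediatrician choices available to each obstetrician
choices_per_obstetrician = {o: sum(1 for (o2, _) in pairs if o2 == o) for (o, _) in pairs}

print(len(pairs), choices_per_obstetrician)
```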


In many counting and probability problems, a configuration called a tree diagram can 
be used to represent pictorially all the possibilities. The tree diagram associated with
Example 2.18 appears in Figure 2.7. Starting from a point on the left side of the
diagram, for each possible first element of a pair a straight-line segment emanates 
rightward. Each of these lines is referred to as a first-generation branch. Now for any 
given first-generation branch we construct another line segment emanating from the tip 
of the branch for each possible choice of a second element of the pair. Each such line 
segment is a second-generation branch. Because there are four obstetricians, there are 
four first-generation branches, and three pediatricians for each obstetrician yields three 
second-generation branches emanating from each first-generation branch. 


Figure 2.7 Tree diagram for Example 2.18 


Generalizing, suppose there are $n_1$ first-generation branches, and for each
first-generation branch there are $n_2$ second-generation branches. The total number of
second-generation branches is then $n_1 n_2$. Since the end of each second-generation
branch corresponds to exactly one possible pair (choosing a first element and then
a second puts us at the end of exactly one second-generation branch), there are $n_1 n_2$
pairs, verifying the product rule.

The construction of a tree diagram does not depend on having the same num- 
ber of second-generation branches emanating from each first-generation branch. If 
the second clinic had four pediatricians, then there would be only three branches 
emanating from two of the first-generation branches and four emanating from each 
of the other two first-generation branches. A tree diagram can thus be used to repre- 
sent pictorially experiments other than those to which the product rule applies. 


A More General Product Rule 


If a six-sided die is tossed five times in succession rather than just twice, then each pos- 
sible outcome is an ordered collection of five numbers such as (1, 3, 1, 2, 4) or (6, 5, 
2, 2, 2). We will call an ordered collection of k objects a k-tuple (so a pair is a 2-tuple 
and a triple is a 3-tuple). Each outcome of the die-tossing experiment is then a 5-tuple. 


Product Rule for k-Tuples 


Suppose a set consists of ordered collections of k elements (k-tuples) and that
there are $n_1$ possible choices for the first element; for each choice of the first
element, there are $n_2$ possible choices of the second element; ...; for each
possible choice of the first $k - 1$ elements, there are $n_k$ choices of the kth element.
Then there are $n_1 n_2 \cdots n_k$ possible k-tuples.

An alternative interpretation involves carrying out an operation in k stages. If
the first stage can be performed in any one of $n_1$ ways, and for each such way there
are $n_2$ ways to perform the second stage, and for each way of performing the first two
stages there are $n_3$ ways to perform the 3rd stage, and so on, then $n_1 n_2 \cdots n_k$ is the
number of ways to carry out the entire k-stage operation in sequence. This more
general rule can also be visualized with a tree diagram. For the case k = 3, simply add
an appropriate number of 3rd-generation branches to the tip of each 2nd-generation
branch. If, for example, a college town has four pizza places, a theater complex with
six screens, and three places to go dancing, then there would be four 1st-generation
branches, six 2nd-generation branches emanating from the tip of each 1st-generation
branch, and three 3rd-generation branches leading off each 2nd-generation branch.
Each possible 3-tuple corresponds to the tip of a 3rd-generation branch.
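
The product rule for k-tuples can be checked by brute-force enumeration. The following Python sketch lists every 3-tuple for the college-town illustration and compares the count with $n_1 n_2 n_3$; the venue labels are hypothetical placeholders, and only the stage sizes (4, 6, 3) come from the text.

```python
from itertools import product
from math import prod

# Stage sizes from the college-town illustration: 4 pizza places,
# 6 screens, 3 places to go dancing. The labels are hypothetical.
pizza = [f"pizza{i}" for i in range(1, 5)]
screens = [f"screen{i}" for i in range(1, 7)]
dancing = [f"dance{i}" for i in range(1, 4)]

# Every 3-tuple is one root-to-tip path in the tree diagram.
evenings = list(product(pizza, screens, dancing))

# The product rule predicts n1 * n2 * n3 tuples.
predicted = prod(len(stage) for stage in (pizza, screens, dancing))
print(len(evenings), predicted)  # 72 72
```

Enumeration is practical only for small problems; the value of the product rule is that it gives the count without listing anything.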


EXAMPLE 2.19 (Example 2.17 continued)
Suppose the home remodeling job involves first purchasing several kitchen
appliances. They will all be purchased from the same dealer, and there are
five dealers in the area. With the dealers denoted by $D_1, \ldots, D_5$, there are
$N = n_1 n_2 n_3 = (5)(12)(9) = 540$ 3-tuples of the form $(D_i, P_j, Q_k)$, so there are 540 ways
to choose first an appliance dealer, then a plumbing contractor, and finally an
electrical contractor. ■


EXAMPLE 2.20 (Example 2.18 continued)
If each clinic has both three specialists in internal medicine and two general
surgeons, there are $n_1 n_2 n_3 n_4 = (4)(3)(3)(2) = 72$ ways to select one doctor of each type
such that all doctors practice at the same clinic. ■


Permutations and Combinations 


Consider a group of n distinct individuals or objects (“distinct” means that there is 
some characteristic that differentiates any particular individual or object from any 
other). How many ways are there to select a subset of size k from the group? For 
example, if a Little League team has 15 players on its roster, how many ways are 
there to select 9 players to form a starting lineup? Or if a university bookstore sells 
ten different laptop computers but has room to display only three of them, in how 
many ways can the three be chosen? 

An answer to the general question just posed requires that we distinguish 
between two cases. In some situations, such as the baseball scenario, the order of 
selection is important. For example, Angela being the pitcher and Ben the catcher 
gives a different lineup from the one in which Angela is catcher and Ben is pitcher. 
Often, though, order is not important and one is interested only in which individuals 
or objects are selected, as would be the case in the laptop display scenario. 


DEFINITION An ordered subset is called a permutation. The number of permutations of
size k that can be formed from the n individuals or objects in a group will be
denoted by $P_{k,n}$. An unordered subset is called a combination. One way to
denote the number of combinations is $C_{k,n}$, but we shall instead use notation
that is quite common in probability books: $\binom{n}{k}$, read “n choose k.”


The number of permutations can be determined by using our earlier counting 
rule for k-tuples. Suppose, for example, that a college of engineering has seven 
departments, which we denote by a, b, c, d, e, f, and g. Each department has one
representative on the college’s student council. From these seven representatives, one is
to be chosen chair, another is to be selected vice-chair, and a third will be secretary.
How many ways are there to select the three officers? That is, how many
permutations of size 3 can be formed from the 7 representatives? To answer this question,
think of forming a triple (3-tuple) in which the first element is the chair, the second
is the vice-chair, and the third is the secretary. One such triple is (a, g, b), another is
(b, g, a), and yet another is (d, f, b). Now the chair can be selected in any of $n_1 = 7$
ways. For each way of selecting the chair, there are $n_2 = 6$ ways to select the
vice-chair, and hence $7 \times 6 = 42$ (chair, vice-chair) pairs. Finally, for each way of
selecting a chair and vice-chair, there are $n_3 = 5$ ways of choosing the secretary. This gives


$$P_{3,7} = (7)(6)(5) = 210$$


as the number of permutations of size 3 that can be formed from 7 distinct individ- 
uals. A tree diagram representation would show three generations of branches. 

The expression for $P_{3,7}$ can be rewritten with the aid of factorial notation.
Recall that 7! (read “7 factorial”) is compact notation for the descending
product of integers (7)(6)(5)(4)(3)(2)(1). More generally, for any positive integer m,
$m! = m(m - 1)(m - 2) \cdots (2)(1)$. This gives 1! = 1, and we also define 0! = 1.
Then 


$$P_{3,7} = (7)(6)(5) = \frac{(7)(6)(5)(4!)}{(4!)} = \frac{7!}{4!}$$


Generalizing to arbitrary group size n and subset size k yields 


$$P_{k,n} = n(n - 1)(n - 2) \cdots (n - (k - 2))(n - (k - 1))$$


Multiplying and dividing this by (n — k)! gives a compact expression for the number 
of permutations. 


PROPOSITION $P_{k,n} = \frac{n!}{(n - k)!}$


EXAMPLE 2.21 There are ten teaching assistants available for grading papers in a calculus course at 
a large university. The first exam consists of four questions in increasing order of dif- 
ficulty, and the professor wishes to select a different assistant to grade each question 
(only one assistant per question). In how many ways can the assistants be chosen for 
grading? Here n = group size = 10 and k = subset size = 4. The number of permu- 
tations is 


$$P_{4,10} = \frac{10!}{(10 - 4)!} = \frac{10!}{6!} = 10(9)(8)(7) = 5040$$


That is, the professor could give 5040 different four-question exams without using
the same assignment of graders to questions, by which time all the teaching
assistants would hopefully have finished their degree programs! ■
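
The permutation counts in this subsection can be reproduced in a few lines of Python. The sketch below enumerates the officer slates for the student-council illustration and evaluates the factorial formula; the helper name `num_permutations` is an assumption made here for illustration, with its arguments ordered to match the subscripts of $P_{k,n}$.

```python
from itertools import permutations
from math import factorial, perm

# Student-council illustration: ordered choices of 3 officers from 7 representatives.
reps = "abcdefg"
officer_slates = list(permutations(reps, 3))  # (chair, vice-chair, secretary)

def num_permutations(k: int, n: int) -> int:
    """P_{k,n} = n! / (n - k)!, following the text's subscript order."""
    return factorial(n) // factorial(n - k)

print(len(officer_slates), num_permutations(3, 7))  # 210 210
print(num_permutations(4, 10), perm(10, 4))         # 5040 5040
```

Note that the standard-library function `math.perm(n, k)` takes the group size first, the reverse of the subscript order used in the text.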


Now let’s move on to combinations (i.e., unordered subsets). Again refer to the 
student council scenario, and suppose that three of the seven representatives are to 
be selected to attend a statewide convention. The order of selection is not important; 
all that matters is which three get selected. So we are looking for $\binom{7}{3}$, the number of
combinations of size 3 that can be formed from the 7 individuals. Consider for a
moment the combination a,c,g. These three individuals can be ordered in 3! = 6
ways to produce permutations:


a,c,g    a,g,c    c,a,g    c,g,a    g,a,c    g,c,a


Similarly, there are 3! = 6 ways to order the combination b,c,e to produce permuta- 
tions, and in fact 3! ways to order any particular combination of size 3 to produce 
permutations. This implies the following relationship between the number of com- 
binations and the number of permutations: 


$$P_{3,7} = (3!) \cdot \binom{7}{3} \;\Rightarrow\; \binom{7}{3} = \frac{P_{3,7}}{3!} = \frac{7!}{(3!)(4!)} = \frac{(7)(6)(5)}{(3)(2)(1)} = 35$$


It would not be too difficult to list the 35 combinations, but there is no need to do so 
if we are interested only in how many there are. Notice that the number of permuta- 
tions 210 far exceeds the number of combinations; the former is larger than the latter 
by a factor of 3! since that is how many ways each combination can be ordered. 

Generalizing the foregoing line of reasoning gives a simple relationship 
between the number of permutations and the number of combinations that yields a 
concise expression for the latter quantity. 


PROPOSITION $\binom{n}{k} = \frac{P_{k,n}}{k!} = \frac{n!}{k!(n - k)!}$


Notice that $\binom{n}{n} = 1$ and $\binom{n}{0} = 1$ since there is only one way to choose a set of
(all) n elements or of no elements, and $\binom{n}{1} = n$ since there are n subsets of size 1.
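
As a quick numerical check of the relationship between permutations and combinations, the following Python sketch enumerates the unordered delegations for the student-council illustration and confirms the special cases just noted. It relies only on the standard library.

```python
from itertools import combinations
from math import comb, factorial, perm

reps = "abcdefg"
delegations = list(combinations(reps, 3))  # unordered 3-subsets of the 7 representatives

# Each combination of size k corresponds to k! permutations.
assert perm(7, 3) == factorial(3) * comb(7, 3)

print(len(delegations), comb(7, 3))  # 35 35
n = 7
print(comb(n, n), comb(n, 0), comb(n, 1))  # 1 1 7
```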


EXAMPLE 2.22 A particular iPod playlist contains 100 songs, 10 of which are by the Beatles.
Suppose the shuffle feature is used to play the songs in random order (the
randomness of the shuffling process is investigated in “Does Your iPod Really Play
Favorites?” (The Amer. Statistician, 2009: 263-268)). What is the probability that
the first Beatles song heard is the fifth song played?

In order for this event to occur, it must be the case that the first four songs 
played are not Beatles’ songs (NBs) and that the fifth song is by the Beatles (B). The 
number of ways to select the first five songs is 100(99)(98)(97)(96). The number of 
ways to select these five songs so that the first four are NBs and the next is a B is 
90(89)(88)(87)(10). The random shuffle assumption implies that any particular set of 
5 songs from amongst the 100 has the same chance of being selected as the first five 
played as does any other set of five songs; each outcome is equally likely. Therefore 
the desired probability is the ratio of the number of outcomes for which the event of 
interest occurs to the number of possible outcomes: 


$$P(\text{1st B is the 5th song played}) = \frac{90 \cdot 89 \cdot 88 \cdot 87 \cdot 10}{100 \cdot 99 \cdot 98 \cdot 97 \cdot 96} = \frac{P_{4,90} \cdot (10)}{P_{5,100}} = .0679$$
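
The ratio of ordered counts can be evaluated exactly with a short Python sketch. It assumes, as the example does, that every ordered selection of the first five songs is equally likely.

```python
from fractions import Fraction
from math import perm

# Ordered ways to fill the first five slots from 100 songs.
total = perm(100, 5)

# Four non-Beatles songs (90 available) followed by one of the 10 Beatles songs.
favorable = perm(90, 4) * 10

p_first_b_is_fifth = Fraction(favorable, total)
print(round(float(p_first_b_is_fifth), 4))  # 0.0679
```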

Here is an alternative line of reasoning involving combinations. Rather than
focusing on selecting just the first five songs, think of playing all 100 songs in random
order. The number of ways of choosing 10 of these songs to be the Bs (without
regard to the order in which they are then played) is $\binom{100}{10}$. Now if we choose 9 of the
last 95 songs to be Bs, which can be done in $\binom{95}{9}$ ways, that leaves four NBs and one
B for the first five songs. There is only one further way for these five to start with
four NBs and then follow with a B (remember that we are considering unordered
subsets). Thus

$$P(\text{1st B is the 5th song played}) = \frac{\binom{95}{9}}{\binom{100}{10}}$$

It is easily verified that this latter expression is in fact identical to the first
expression for the desired probability, so the numerical result is again .0679.

The probability that one of the first five songs played is a Beatles’ song is

$$P(\text{1st B is the 1st or 2nd or 3rd or 4th or 5th song played}) = \frac{\binom{99}{9}}{\binom{100}{10}} + \frac{\binom{98}{9}}{\binom{100}{10}} + \frac{\binom{97}{9}}{\binom{100}{10}} + \frac{\binom{96}{9}}{\binom{100}{10}} + \frac{\binom{95}{9}}{\binom{100}{10}} = .4162$$

It is thus rather likely that a Beatles’ song will be one of the first five songs played.
Such a “coincidence” is not as surprising as might first appear to be the case. ■


EXAMPLE 2.23 A university warehouse has received a shipment of 25 printers, of which 10 are laser
printers and 15 are inkjet models. If 6 of these 25 are selected at random to be
checked by a particular technician, what is the probability that exactly 3 of those
selected are laser printers (so that the other 3 are inkjets)?

Let $D_3$ = {exactly 3 of the 6 selected are inkjet printers}. Assuming that any
particular set of 6 printers is as likely to be chosen as is any other set of 6, we have
equally likely outcomes, so $P(D_3) = N(D_3)/N$, where N is the number of ways
of choosing 6 printers from the 25 and $N(D_3)$ is the number of ways of choosing
3 laser printers and 3 inkjet models. Thus $N = \binom{25}{6}$. To obtain $N(D_3)$, think of first
choosing 3 of the 15 inkjet models and then 3 of the laser printers. There are $\binom{15}{3}$
ways of choosing the 3 inkjet models, and there are $\binom{10}{3}$ ways of choosing the 3 laser
printers; $N(D_3)$ is now the product of these two numbers (visualize a tree diagram;
we are really using a product rule argument here), so


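$$P(D_3) = \frac{N(D_3)}{N} = \frac{\binom{15}{3}\binom{10}{3}}{\binom{25}{6}} = \frac{\frac{15!}{3!12!} \cdot \frac{10!}{3!7!}}{\frac{25!}{6!19!}} = .3083$$

Let $D_4$ = {exactly 4 of the 6 printers selected are inkjet models} and define $D_5$ and
$D_6$ in an analogous manner. Then the probability that at least 3 inkjet printers are
selected is

$$P(D_3 \cup D_4 \cup D_5 \cup D_6) = P(D_3) + P(D_4) + P(D_5) + P(D_6)$$

$$= \frac{\binom{15}{3}\binom{10}{3}}{\binom{25}{6}} + \frac{\binom{15}{4}\binom{10}{2}}{\binom{25}{6}} + \frac{\binom{15}{5}\binom{10}{1}}{\binom{25}{6}} + \frac{\binom{15}{6}\binom{10}{0}}{\binom{25}{6}} = .8530$$ ■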
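
The counting argument of Example 2.23 translates directly into code. In the Python sketch below, the function name `p_inkjets` and its default arguments are illustrative assumptions; the formula itself is the ratio of combination counts used above.

```python
from math import comb

def p_inkjets(x: int, inkjets: int = 15, lasers: int = 10, sample: int = 6) -> float:
    """P(exactly x inkjets in the sample), assuming all subsets are equally likely."""
    return comb(inkjets, x) * comb(lasers, sample - x) / comb(inkjets + lasers, sample)

p_d3 = p_inkjets(3)
p_at_least_3 = sum(p_inkjets(x) for x in range(3, 7))

print(round(p_d3, 4))          # 0.3083
print(round(p_at_least_3, 4))  # 0.853
```

Summing `p_inkjets(x)` over all x from 0 to 6 should give 1, which is a convenient sanity check on the counts.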
EXERCISES Section 2.3 (29–44)


29. As of April 2006, roughly 50 million .com web domain
names were registered (e.g., yahoo.com).

a. How many domain names consisting of just two let- 
ters in sequence can be formed? How many domain 
names of length two are there if digits as well as 
letters are permitted as characters? [Note: A charac- 
ter length of three or more is now mandated.] 

b. How many domain names are there consisting of 
three letters in sequence? How many of this length 
are there if either letters or digits are permitted? 
[Note: All are currently taken.] 

c. Answer the questions posed in (b) for four-character 
sequences. 

d. As of April 2006, 97,786 of the four-character se- 
quences using either letters or digits had not yet been 
claimed. If a four-character name is randomly selected, 
what is the probability that it is already owned? 


30. A friend of mine is giving a dinner party. His current 
wine supply includes 8 bottles of zinfandel, 10 of merlot, 
and 12 of cabernet (he only drinks red wine), all from 
different wineries. 

a. If he wants to serve 3 bottles of zinfandel and serving 
order is important, how many ways are there to do 
this? 

b. If 6 bottles of wine are to be randomly selected from 
the 30 for serving, how many ways are there to do 
this? 

c. If 6 bottles are randomly selected, how many ways 
are there to obtain two bottles of each variety? 

d. If 6 bottles are randomly selected, what is the proba- 
bility that this results in two bottles of each variety 
being chosen? 

e. If 6 bottles are randomly selected, what is the proba- 
bility that all of them are the same variety? 


31. The composer Beethoven wrote 9 symphonies, 5 piano 
concertos (music for piano and orchestra), and 32 piano 
sonatas (music for solo piano). 

a. How many ways are there to play first a Beethoven 
symphony and then a Beethoven piano concerto? 

b. The manager of a radio station decides that on each 
successive evening (7 days per week), a Beethoven 
symphony will be played followed by a Beethoven 
piano concerto followed by a Beethoven piano sonata. 
For how many years could this policy be continued 
before exactly the same program would have to be 
repeated? 


32. An electronics store is offering a special price on a
complete set of components (receiver, compact disc player,
speakers, turntable). A purchaser is offered a choice of
manufacturer for each component:

Receiver: Kenwood, Onkyo, Pioneer, Sony, Sherwood
Compact disc player: Onkyo, Pioneer, Sony, Technics
Speakers: Boston, Infinity, Polk
Turntable: Onkyo, Sony, Teac, Technics

A switchboard display in the store allows a customer to
hook together any selection of components (consisting
of one of each type). Use the product rules to answer the
following questions:

a. In how many ways can one component of each type 
be selected? 

b. In how many ways can components be selected if 
both the receiver and the compact disc player are to 
be Sony? 

c. In how many ways can components be selected if none 
is to be Sony? 

d. In how many ways can a selection be made if at least 
one Sony component is to be included? 

e. If someone flips switches on the selection in a com- 
pletely random fashion, what is the probability that 
the system selected contains at least one Sony com- 
ponent? Exactly one Sony component? 


33. Again consider a Little League team that has 15 players
on its roster.

a. How many ways are there to select 9 players for the 
starting lineup? 

b. How many ways are there to select 9 players for the 
starting lineup and a batting order for the 9 starters? 

c. Suppose 5 of the 15 players are left-handed. How 
many ways are there to select 3 left-handed outfielders 
and have all 6 other positions occupied by right-handed 
players? 


34. Computer keyboard failures can be attributed to
electrical defects or mechanical defects. A repair facility
currently has 25 failed keyboards, 6 of which have electrical
defects and 19 of which have mechanical defects.

a. How many ways are there to randomly select 5 of these 
keyboards for a thorough inspection (without regard to 
order)? 

b. In how many ways can a sample of 5 keyboards be 
selected so that exactly two have an electrical defect? 

c. If a sample of 5 keyboards is randomly selected, what
is the probability that at least 4 of these will have a
mechanical defect?


35. A production facility employs 10 workers on the day
shift, 8 workers on the swing shift, and 6 workers on the
graveyard shift. A quality control consultant is to select
5 of these workers for in-depth interviews. Suppose the
selection is made in such a way that any particular group
of 5 workers has the same chance of being selected as
does any other group (drawing 5 slips without
replacement from among 24).

a. How many selections result in all 5 workers coming 
from the day shift? What is the probability that all 
5 selected workers will be from the day shift? 

b. What is the probability that all 5 selected workers will 
be from the same shift? 

c. What is the probability that at least two different shifts 
will be represented among the selected workers? 

d. What is the probability that at least one of the shifts 
will be unrepresented in the sample of workers? 


36. An academic department with five faculty members
narrowed its choice for department head to either candidate
A or candidate B. Each member then voted on a slip of
paper for one of the candidates. Suppose there are
actually three votes for A and two for B. If the slips are
selected for tallying in random order, what is the
probability that A remains ahead of B throughout the vote
count (e.g., this event occurs if the selected ordering is
AABAB, but not for ABBAA)?


37. An experimenter is studying the effects of temperature,
pressure, and type of catalyst on yield from a certain
chemical reaction. Three different temperatures, four
different pressures, and five different catalysts are under
consideration.

a. If any particular experimental run involves the use of 
a single temperature, pressure, and catalyst, how 
many experimental runs are possible? 

b. How many experimental runs are there that involve use 
of the lowest temperature and two lowest pressures? 

c. Suppose that five different experimental runs are to 
be made on the first day of experimentation. If the 
five are randomly selected from among all the possi- 
bilities, so that any group of five has the same prob- 
ability of selection, what is the probability that a 
different catalyst is used on each run? 


38. A sonnet is a 14-line poem in which certain rhyming
patterns are followed. The writer Raymond Queneau
published a book containing just 10 sonnets, each on a
different page. However, these were structured such that
other sonnets could be created as follows: the first line of
a sonnet could come from the first line on any of the 10
pages, the second line could come from the second line
on any of the 10 pages, and so on (successive lines were
perforated for this purpose).

a. How many sonnets can be created from the 10 in the 
book? 

b. If one of the sonnets counted in part (a) is selected at 
random, what is the probability that none of its lines 
came from either the first or the last sonnet in the 
book? 


39. A box in a supply room contains 15 compact fluorescent
lightbulbs, of which 5 are rated 13-watt, 6 are rated
18-watt, and 4 are rated 23-watt. Suppose that three of
these bulbs are randomly selected.



a. What is the probability that exactly two of the 
selected bulbs are rated 23-watt? 

b. What is the probability that all three of the bulbs 
have the same rating? 

c. What is the probability that one bulb of each type is 
selected? 

d. If bulbs are selected one by one until a 23-watt bulb 
is obtained, what is the probability that it is neces- 
sary to examine at least 6 bulbs? 


40. Three molecules of type A, three of type B, three of type
C, and three of type D are to be linked together to form
a chain molecule. One such chain molecule is
ABCDABCDABCD, and another is BCDDAAABDBCC.

a. How many such chain molecules are there? [Hint: If
the three A’s were distinguishable from one another
($A_1$, $A_2$, $A_3$) and the B’s, C’s, and D’s were also, how
many molecules would there be? How is this number
reduced when the subscripts are removed from the
A’s?]

b. Suppose a chain molecule of the type described is 
randomly selected. What is the probability that all 
three molecules of each type end up next to one 
another (such as in BBBAAADDDCCC)? 


41. An ATM personal identification number (PIN) consists
of four digits, each a 0, 1, 2, ..., 8, or 9, in succession.

a. How many different possible PINs are there if there 
are no restrictions on the choice of digits? 

b. According to a representative at the author’s local 
branch of Chase Bank, there are in fact restrictions on 
the choice of digits. The following choices are pro- 
hibited: (i) all four digits identical (ii) sequences of 
consecutive ascending or descending digits, such as 
6543 (iii) any sequence starting with 19 (birth years 
are too easy to guess). So if one of the PINs in (a) is 
randomly selected, what is the probability that it will 
be a legitimate PIN (that is, not be one of the prohib- 
ited sequences)? 

c. Someone has stolen an ATM card and knows that 
the first and last digits of the PIN are 8 and 1, 
respectively. He has three tries before the card is 
retained by the ATM (but does not realize that). So 
he randomly selects the 2nd and 3rd digits for the 
first try, then randomly selects a different pair of 
digits for the second try, and yet another randomly 
selected pair of digits for the third try (the individ- 
ual knows about the restrictions described in (b) so 
selects only from the legitimate possibilities). What 
is the probability that the individual gains access to 
the account? 

d. Recalculate the probability in (c) if the first and last 
digits are 1 and 1, respectively. 


42. A starting lineup in basketball consists of two guards,
two forwards, and a center.

a. A certain college team has on its roster three centers,
four guards, four forwards, and one individual (X)
who can play either guard or forward. How many
different starting lineups can be created? [Hint:
Consider lineups without X, then lineups with X as
guard, then lineups with X as forward.]

b. Now suppose the roster has 5 guards, 5 forwards,
3 centers, and 2 “swing players” (X and Y) who
can play either guard or forward. If 5 of the 15
players are randomly selected, what is the
probability that they constitute a legitimate starting
lineup?


43. In five-card poker, a straight consists of five cards
with adjacent denominations (e.g., 9 of clubs, 10 of
hearts, jack of hearts, queen of spades, and king of
clubs). Assuming that aces can be high or low, if you are
dealt a five-card hand, what is the probability that it will
be a straight with high card 10? What is the probability
that it will be a straight? What is the probability that it
will be a straight flush (all cards in the same suit)?


44. Show that $\binom{n}{k} = \binom{n}{n-k}$. Give an interpretation involving
subsets.
2.4 Conditional Probability


The probabilities assigned to various events depend on what is known about the
experimental situation when the assignment is made. Subsequent to the initial assignment,
partial information relevant to the outcome of the experiment may become available.
Such information may cause us to revise some of our probability assignments. For a
particular event A, we have used P(A) to represent the probability assigned to A; we
now think of P(A) as the original, or unconditional, probability of the event A.

In this section, we examine how the information “an event B has occurred” affects
the probability assigned to A. For example, A might refer to an individual having a
particular disease in the presence of certain symptoms. If a blood test is performed on the
individual and the result is negative (B = negative blood test), then the probability of
having the disease will change (it should decrease, but not usually to zero, since blood
tests are not infallible). We will use the notation P(A|B) to represent the conditional
probability of A given that the event B has occurred. B is the “conditioning event.”

As an example, consider the event A that a randomly selected student at your
university obtained all desired classes during the previous term’s registration cycle.
Presumably P(A) is not very large. However, suppose the selected student is an
athlete who gets special registration priority (the event B). Then P(A|B) should be
substantially larger than P(A), although perhaps still not close to 1.


EXAMPLE 2.24 Complex components are assembled in a plant that uses two different assembly 
lines, A and A’. Line A uses older equipment than A’, so it is somewhat slower and 
less reliable. Suppose on a given day line A has assembled 8 components, of which 
2 have been identified as defective (B) and 6 as nondefective (B’), whereas A’ has 
produced 1 defective and 9 nondefective components. This information is summa- 
rized in the accompanying table. 


                 Condition
                 B      B'
Line    A        2      6
        A'       1      9


Unaware of this information, the sales manager randomly selects 1 of these 18
components for a demonstration. Prior to the demonstration

$$P(\text{line A component selected}) = P(A) = \frac{N(A)}{N} = \frac{8}{18} = .44$$

However, if the chosen component turns out to be defective, then the event B has
occurred, so the component must have been 1 of the 3 in the B column of the table.
Since these 3 components are equally likely among themselves after B has occurred,

$$P(A|B) = \frac{2}{3} = \frac{2/18}{3/18} = \frac{P(A \cap B)}{P(B)} \qquad (2.2)$$


In Equation (2.2), the conditional probability is expressed as a ratio of uncon- 
ditional probabilities: The numerator is the probability of the intersection of the two 
events, whereas the denominator is the probability of the conditioning event B. A 
Venn diagram illuminates this relationship (Figure 2.8). 
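
The calculation in Example 2.24 can be mirrored with exact fractions. The Python sketch below stores the table counts in a dictionary (a representation chosen here purely for illustration) and computes P(A|B) as the ratio of unconditional probabilities.

```python
from fractions import Fraction

# Counts from the table in Example 2.24: rows are lines, columns are conditions.
counts = {
    ("A", "B"): 2, ("A", "B'"): 6,
    ("A'", "B"): 1, ("A'", "B'"): 9,
}
n_total = sum(counts.values())  # 18 components

p_a = Fraction(counts[("A", "B")] + counts[("A", "B'")], n_total)
p_b = Fraction(counts[("A", "B")] + counts[("A'", "B")], n_total)
p_a_and_b = Fraction(counts[("A", "B")], n_total)

# Ratio of unconditional probabilities, as in Equation (2.2).
p_a_given_b = p_a_and_b / p_b

print(p_a, p_b, p_a_given_b)  # 4/9 1/6 2/3
```

The same value 2/3 results from restricting attention to the 3 defective components and counting how many came from line A.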


Figure 2.8 Motivating the definition of conditional probability


Given that B has occurred, the relevant sample space is no longer $\mathcal{S}$ but
consists of outcomes in B; A has occurred if and only if one of the outcomes in the
intersection occurred, so the conditional probability of A given B is proportional to
$P(A \cap B)$. The proportionality constant 1/P(B) is used to ensure that the probability
P(B|B) of the new sample space B equals 1.


The Definition of Conditional Probability 


Example 2.24 demonstrates that when outcomes are equally likely, computation 
of conditional probabilities can be based on intuition. When experiments are more 
complicated, though, intuition may fail us, so a general definition of conditional 
probability is needed that will yield intuitive answers in simple problems. The Venn 
diagram and Equation (2.2) suggest how to proceed. 


DEFINITION For any two events A and B with P(B) > 0, the conditional probability of A
given that B has occurred is defined by

$$P(A|B) = \frac{P(A \cap B)}{P(B)} \qquad (2.3)$$


EXAMPLE 2.25 Suppose that of all individuals buying a certain digital camera, 60% include an
optional memory card in their purchase, 40% include an extra battery, and 30%
include both a card and battery. Consider randomly selecting a buyer and let
A = {memory card purchased} and B = {battery purchased}. Then P(A) = .60,
P(B) = .40, and P(both purchased) = $P(A \cap B)$ = .30. Given that the selected
individual purchased an extra battery, the probability that an optional card was also
purchased is

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{.30}{.40} = .75$$


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 




That is, of all those purchasing an extra battery, 75% purchased an optional memory 
card. Similarly, 


$$P(\text{battery}|\text{memory card}) = P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{.30}{.60} = .50$$

Notice that $P(A|B) \neq P(A)$ and $P(B|A) \neq P(B)$. ■
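
The definition in Equation (2.3) is a one-line computation. The Python sketch below wraps it in a small helper (the name `conditional` is hypothetical) that also enforces the requirement P(B) > 0, and applies it to the camera-purchase probabilities.

```python
def conditional(p_joint: float, p_given: float) -> float:
    """P(X | Y) = P(X and Y) / P(Y); defined only when P(Y) > 0."""
    if p_given <= 0:
        raise ValueError("conditioning event must have positive probability")
    return p_joint / p_given

p_card, p_battery, p_both = 0.60, 0.40, 0.30

p_card_given_battery = conditional(p_both, p_battery)
p_battery_given_card = conditional(p_both, p_card)

print(round(p_card_given_battery, 2), round(p_battery_given_card, 2))  # 0.75 0.5
```

The two results differ because the same joint probability is divided by different conditioning probabilities.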


The event whose probability is desired might be a union or intersection of 
other events, and the same could be true of the conditioning event. 


EXAMPLE 2.26 A news magazine publishes three columns entitled “Art” (A), “Books” (B), and
“Cinema” (C). Reading habits of a randomly selected reader with respect to these
columns are

Read regularly   A     B     C     A ∩ B   A ∩ C   B ∩ C   A ∩ B ∩ C
Probability      .14   .23   .37   .08     .09     .13     .05

Figure 2.9 illustrates relevant probabilities.

Figure 2.9 Venn diagram for Example 2.26

Consider the following four conditional probabilities:

(i) $P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{.08}{.23} = .348$

(ii) The probability that the selected individual regularly reads the Art column
given that he or she regularly reads at least one of the other two columns is

$$P(A|B \cup C) = \frac{P(A \cap (B \cup C))}{P(B \cup C)} = \frac{.04 + .05 + .03}{.47} = \frac{.12}{.47} = .255$$

(iii) $P(A|\text{reads at least one}) = P(A|A \cup B \cup C) = \frac{P(A \cap (A \cup B \cup C))}{P(A \cup B \cup C)} = \frac{P(A)}{P(A \cup B \cup C)} = \frac{.14}{.49} = .286$

(iv) The probability that the selected individual reads at least one of the first two
columns given that he or she reads the Cinema column is

$$P(A \cup B|C) = \frac{P((A \cup B) \cap C)}{P(C)} = \frac{.04 + .05 + .08}{.37} = .459$$ ■


The Multiplication Rule for $P(A \cap B)$


The definition of conditional probability yields the following result, obtained by
multiplying both sides of Equation (2.3) by P(B).


The Multiplication Rule
$P(A \cap B) = P(A|B) \cdot P(B)$




This rule is important because it is often the case that $P(A \cap B)$ is desired,
whereas both P(B) and P(A|B) can be obtained from the problem description.
Consideration of P(B|A) gives $P(A \cap B) = P(B|A) \cdot P(A)$.
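
The four conditional probabilities of Example 2.26, and the multiplication rule itself, can be checked numerically. The Python sketch below uses inclusion-exclusion on the tabled probabilities; the dictionary keys such as "AB" are shorthand introduced here for the intersections and are not notation from the text.

```python
# Probabilities from the table in Example 2.26.
p = {"A": .14, "B": .23, "C": .37,
     "AB": .08, "AC": .09, "BC": .13, "ABC": .05}

# Inclusion-exclusion for the unions that serve as conditioning events.
p_b_or_c = p["B"] + p["C"] - p["BC"]
p_any = p["A"] + p["B"] + p["C"] - p["AB"] - p["AC"] - p["BC"] + p["ABC"]

# A and (B or C), and (A or B) and C, again by inclusion-exclusion.
p_a_and_b_or_c = p["AB"] + p["AC"] - p["ABC"]
p_a_or_b_and_c = p["AC"] + p["BC"] - p["ABC"]

results = {
    "P(A|B)": p["AB"] / p["B"],
    "P(A|B or C)": p_a_and_b_or_c / p_b_or_c,
    "P(A|at least one)": p["A"] / p_any,
    "P(A or B|C)": p_a_or_b_and_c / p["C"],
}
for name, value in results.items():
    print(name, round(value, 3))

# Multiplication rule: P(A and B) = P(A|B) * P(B) recovers the tabled value.
assert abs(results["P(A|B)"] * p["B"] - p["AB"]) < 1e-12
```

The final assertion simply undoes the division in the definition, which is all the multiplication rule amounts to algebraically.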


EXAMPLE 2.27 Four individuals have responded to a request by a blood bank for blood donations. 
None of them has donated before, so their blood types are unknown. Suppose only 
type O+ is desired and only one of the four actually has this type. If the potential 
donors are selected in random order for typing, what is the probability that at least 
three individuals must be typed to obtain the desired type? 

Making the identification B = {first type not O+} and A = {second type not
O+}, P(B) = 3/4. Given that the first type is not O+, two of the three individuals
left are not O+, so P(A|B) = 2/3. The multiplication rule now gives

$$P(\text{at least three individuals are typed}) = P(A \cap B) = P(A|B) \cdot P(B) = \frac{2}{3} \cdot \frac{3}{4} = \frac{6}{12} = .5$$ ■

The multiplication rule is most useful when the experiment consists of several 
stages in succession. The conditioning event B then describes the outcome of the first 
stage and A the outcome of the second, so that P(A|B)—conditioning on what occurs 
first—will often be known. The rule is easily extended to experiments involving more 
than two stages. For example, consider three events $A_1$, $A_2$, and $A_3$. The triple
intersection of these events can be represented as the double intersection $(A_1 \cap A_2) \cap A_3$.
Applying our previous multiplication rule to this intersection and then to $A_1 \cap A_2$ gives

$$\begin{aligned}
P(A_1 \cap A_2 \cap A_3) &= P(A_3 \mid A_1 \cap A_2) \cdot P(A_1 \cap A_2) \\
&= P(A_3 \mid A_1 \cap A_2) \cdot P(A_2 \mid A_1) \cdot P(A_1)
\end{aligned} \tag{2.4}$$

Thus the triple intersection probability is a product of three probabilities, two of
which are conditional.

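EXAMPLE 2.28 For the blood-typing experiment of Example 2.27,

$$\begin{aligned}
P(\text{third type is O+}) &= P(\text{third is} \mid \text{first isn't} \cap \text{second isn't}) \cdot P(\text{second isn't} \mid \text{first isn't}) \cdot P(\text{first isn't}) \\
&= \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} = \frac{1}{4} = .25
\end{aligned}$$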
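
The answers to Examples 2.27 and 2.28 can be checked by brute force. The following Python sketch lists every equally likely order in which the four donors might be typed and counts the favorable orders. The donor labels and the helper name `prob` are illustrative assumptions, not part of the examples.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels: one O+ donor and three donors of other types.
donors = ["O+", "x1", "x2", "x3"]
orders = list(permutations(donors))  # 24 equally likely typing orders


def prob(event):
    """Exact probability of an event when all orders are equally likely."""
    favorable = sum(1 for order in orders if event(order))
    return Fraction(favorable, len(orders))


# Example 2.27: at least three typings are needed, i.e. the first two are not O+.
p_at_least_three = prob(lambda o: o[0] != "O+" and o[1] != "O+")

# Example 2.28: the third person typed is the O+ donor.
p_third_is_target = prob(lambda o: o[2] == "O+")

print(p_at_least_three, p_third_is_target)  # 1/2 1/4
```

Enumeration agrees with the multiplication rule here, but it scales poorly; the rule itself is what makes multistage problems tractable.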


When the experiment of interest consists of a sequence of several stages, it is 
convenient to represent these with a tree diagram. Once we have an appropriate tree 
diagram, probabilities and conditional probabilities can be entered on the various 
branches; this will make repeated use of the multiplication rule quite straightforward. 


EXAMPLE 2.29 An electronics store sells three different brands of DVD players. Of its DVD player
sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20% are brand 3.
Each manufacturer offers a 1-year warranty on parts and labor. It is known that 25%
of brand 1’s DVD players require warranty repair work, whereas the corresponding
percentages for brands 2 and 3 are 20% and 10%, respectively.

1. What is the probability that a randomly selected purchaser has bought a brand 1
DVD player that will need repair while under warranty?

2. What is the probability that a randomly selected purchaser has a DVD player
that will need repair while under warranty?

3. If a customer returns to the store with a DVD player that needs warranty repair
work, what is the probability that it is a brand 1 DVD player? A brand 2 DVD
player? A brand 3 DVD player?


The first stage of the problem involves a customer selecting one of the
three brands of DVD player. Let $A_i = \{\text{brand } i \text{ is purchased}\}$, for $i = 1$, 2, and 3.
Then $P(A_1) = .50$, $P(A_2) = .30$, and $P(A_3) = .20$. Once a brand of DVD player is
selected, the second stage involves observing whether the selected DVD player
needs warranty repair. With $B = \{\text{needs repair}\}$ and $B' = \{\text{doesn't need repair}\}$, the
given information implies that $P(B \mid A_1) = .25$, $P(B \mid A_2) = .20$, and $P(B \mid A_3) = .10$.

The tree diagram representing this experimental situation is shown in
Figure 2.10. The initial branches correspond to different brands of DVD players;
there are two second-generation branches emanating from the tip of each
initial branch, one for “needs repair” and the other for “doesn’t need repair.”
The probability $P(A_i)$ appears on the $i$th initial branch, whereas the conditional
probabilities $P(B \mid A_i)$ and $P(B' \mid A_i)$ appear on the second-generation branches.
To the right of each second-generation branch corresponding to the occurrence
of $B$, we display the product of probabilities on the branches leading out to
that point. This is simply the multiplication rule in action. The answer to the
question posed in 1 is thus $P(A_1 \cap B) = P(B \mid A_1) \cdot P(A_1) = .125$. The answer to
question 2 is

$$\begin{aligned}
P(B) &= P[(\text{brand 1 and repair}) \text{ or } (\text{brand 2 and repair}) \text{ or } (\text{brand 3 and repair})] \\
&= P(A_1 \cap B) + P(A_2 \cap B) + P(A_3 \cap B) \\
&= .125 + .060 + .020 = .205
\end{aligned}$$


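[Tree diagram: initial branches for brands 1, 2, and 3 with $P(A_1) = .50$, $P(A_2) = .30$, and $P(A_3) = .20$. Products on the "needs repair" branches: $P(B \mid A_1) \cdot P(A_1) = P(B \cap A_1) = .125$, $P(B \mid A_2) \cdot P(A_2) = P(B \cap A_2) = .060$, $P(B \mid A_3) \cdot P(A_3) = P(B \cap A_3) = .020$, so that $P(B) = .205$.]

Figure 2.10 Tree diagram for Example 2.29
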
Finally,

$$P(A_1 \mid B) = \frac{P(A_1 \cap B)}{P(B)} = \frac{.125}{.205} = .61$$

$$P(A_2 \mid B) = \frac{P(A_2 \cap B)}{P(B)} = \frac{.060}{.205} = .29$$

and

$$P(A_3 \mid B) = 1 - P(A_1 \mid B) - P(A_2 \mid B) = .10$$

The initial or prior probability of brand 1 is .50. Once it is known that the
selected DVD player needed repair, the posterior probability of brand 1 increases to
.61. This is because brand 1 DVD players are more likely to need warranty repair
than are the other brands. The posterior probability of brand 3 is $P(A_3 \mid B) = .10$,
which is much less than the prior probability $P(A_3) = .20$.
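
The tree-diagram arithmetic of Example 2.29 can be sketched in a few lines of Python. The dictionary names below are illustrative assumptions; the numbers are those given in the example.

```python
# Priors P(A_i) and repair rates P(B | A_i) from Example 2.29.
priors = {1: 0.50, 2: 0.30, 3: 0.20}
repair_given_brand = {1: 0.25, 2: 0.20, 3: 0.10}

# Multiplication rule along each branch: P(A_i and B) = P(B | A_i) * P(A_i).
joint = {i: repair_given_brand[i] * priors[i] for i in priors}

# Add the branches that end in "needs repair" to get P(B).
p_repair = sum(joint.values())

# Posterior P(A_i | B) = P(A_i and B) / P(B).
posterior = {i: joint[i] / p_repair for i in joint}

print(round(p_repair, 3))
print({i: round(p, 2) for i, p in posterior.items()})
```

Computing all three posteriors directly, rather than obtaining the last one by subtraction, gives a quick check that they sum to 1.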


Bayes’ Theorem 


The computation of a posterior probability $P(A_j \mid B)$ from given prior probabilities
$P(A_i)$ and conditional probabilities $P(B \mid A_i)$ occupies a central position in
elementary probability. The general rule for such computations, which is really
just a simple application of the multiplication rule, goes back to Reverend Thomas
Bayes, who lived in the eighteenth century. To state it we first need another
result. Recall that events $A_1, \ldots, A_k$ are mutually exclusive if no two have any common
outcomes. The events are exhaustive if one $A_i$ must occur, so that $A_1 \cup \cdots \cup A_k = \mathcal{S}$.


The Law of Total Probability 


Let $A_1, \ldots, A_k$ be mutually exclusive and exhaustive events. Then for any other
event $B$,

$$P(B) = P(B \mid A_1)P(A_1) + \cdots + P(B \mid A_k)P(A_k) = \sum_{i=1}^{k} P(B \mid A_i)P(A_i) \tag{2.5}$$


Proof Because the $A_i$’s are mutually exclusive and exhaustive, if $B$ occurs it must be
in conjunction with exactly one of the $A_i$’s. That is, $B = (A_1 \cap B) \cup \cdots \cup (A_k \cap B)$,
where the events $(A_i \cap B)$ are mutually exclusive. This “partitioning of $B$” is illustrated
in Figure 2.11. Thus

$$P(B) = \sum_{i=1}^{k} P(A_i \cap B) = \sum_{i=1}^{k} P(B \mid A_i)P(A_i)$$

as desired.


Figure 2.11 Partition of $B$ by mutually exclusive and exhaustive $A_i$’s
EXAMPLE 2.30 An individual has 3 different email accounts. Most of her messages, in fact 70%,
come into account #1, whereas 20% come into account #2 and the remaining 10%
into account #3. Of the messages into account #1, only 1% are spam, whereas the
corresponding percentages for accounts #2 and #3 are 2% and 5%, respectively.
What is the probability that a randomly selected message is spam?

To answer this question, let’s first establish some notation:

$$A_i = \{\text{message is from account \#}i\} \text{ for } i = 1, 2, 3, \qquad B = \{\text{message is spam}\}$$

Then the given percentages imply that

$$P(A_1) = .70, \quad P(A_2) = .20, \quad P(A_3) = .10$$
$$P(B \mid A_1) = .01, \quad P(B \mid A_2) = .02, \quad P(B \mid A_3) = .05$$

Now it is simply a matter of substituting into the equation for the law of total
probability:

$$P(B) = (.01)(.70) + (.02)(.20) + (.05)(.10) = .016$$

In the long run, 1.6% of this individual’s messages will be spam.
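
The long-run interpretation in Example 2.30 can be illustrated by simulation. The sketch below, in Python, draws an account and then a spam indicator for each simulated message. The function name `simulate_spam_rate`, the sample size, and the seed are assumptions chosen for illustration only.

```python
import random


def simulate_spam_rate(n_messages, seed=0):
    """Estimate P(spam) by simulating the two-stage experiment."""
    rng = random.Random(seed)
    accounts = [1, 2, 3]
    account_probs = [0.70, 0.20, 0.10]        # P(A_i)
    spam_rate = {1: 0.01, 2: 0.02, 3: 0.05}   # P(B | A_i)
    spam_count = 0
    for _ in range(n_messages):
        # Stage 1: which account receives the message.
        account = rng.choices(accounts, weights=account_probs)[0]
        # Stage 2: whether the message is spam, given the account.
        if rng.random() < spam_rate[account]:
            spam_count += 1
    return spam_count / n_messages


estimate = simulate_spam_rate(200_000)
print(estimate)  # should land near the exact value .016
```

The estimate varies with the seed and sample size; the law of total probability gives the exact value with no simulation error.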


Bayes’ Theorem 


Let $A_1, A_2, \ldots, A_k$ be a collection of $k$ mutually exclusive and exhaustive events
with prior probabilities $P(A_i)$ $(i = 1, \ldots, k)$. Then for any other event $B$ for
which $P(B) > 0$, the posterior probability of $A_j$ given that $B$ has occurred is

$$P(A_j \mid B) = \frac{P(A_j \cap B)}{P(B)} = \frac{P(B \mid A_j)P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i) \cdot P(A_i)} \qquad j = 1, \ldots, k \tag{2.6}$$


The transition from the second to the third expression in (2.6) rests on using 
the multiplication rule in the numerator and the law of total probability in the 
denominator. The proliferation of events and subscripts in (2.6) can be a bit intimi- 
dating to probability newcomers. As long as there are relatively few events in the 
partition, a tree diagram (as in Example 2.29) can be used as a basis for calculating 
posterior probabilities without ever referring explicitly to Bayes’ theorem. 


EXAMPLE 2.31 Incidence of a rare disease. Only 1 in 1000 adults is afflicted with a rare disease for
which a diagnostic test has been developed. The test is such that when an individual
actually has the disease, a positive result will occur 99% of the time, whereas an
individual without the disease will show a positive test result only 2% of the time
(the sensitivity of this test is 99% and the specificity is 98%; in contrast, the Sept.
22, 2012 issue of The Lancet reports that the first at-home HIV test has a sensitivity
of only 92% and a specificity of 99.98%). If a randomly selected individual is tested
and the result is positive, what is the probability that the individual has the disease?

To use Bayes’ theorem, let $A_1$ = individual has the disease, $A_2$ = individual
does not have the disease, and $B$ = positive test result. Then $P(A_1) = .001$,
$P(A_2) = .999$, $P(B \mid A_1) = .99$, and $P(B \mid A_2) = .02$. The tree diagram for this problem
is in Figure 2.12.
[Tree diagram: $P(A_1 \cap B) = .00099$, $P(A_2 \cap B) = .01998$]

Figure 2.12 Tree diagram for the rare-disease problem


Next to each branch corresponding to a positive test result, the multiplication
rule yields the recorded probabilities. Therefore, $P(B) = .00099 + .01998 = .02097$,
from which we have

$$P(A_1 \mid B) = \frac{P(A_1 \cap B)}{P(B)} = \frac{.00099}{.02097} = .047$$

This result seems counterintuitive; the diagnostic test appears so accurate that we 
expect someone with a positive test result to be highly likely to have the disease, 
whereas the computed conditional probability is only .047. However, the rarity of the 
disease implies that most positive test results arise from errors rather than from diseased
individuals. The probability of having the disease has increased by a multiplicative
factor of 47 (from prior .001 to posterior .047); but to get a further increase in the
posterior probability, a diagnostic test with much smaller error rates is needed.
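
How strongly the posterior depends on the prior can be seen by wrapping the calculation of Example 2.31 in a small function. This is a minimal Python sketch; the function name `posterior_given_positive` and the second, more common disease prevalence are hypothetical and serve only as a comparison.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem with a two-event partition."""
    true_pos = sensitivity * prevalence                # P(B | A_1) * P(A_1)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(B | A_2) * P(A_2)
    return true_pos / (true_pos + false_pos)


# Values from Example 2.31.
p_rare = posterior_given_positive(prevalence=0.001, sensitivity=0.99, specificity=0.98)
print(round(p_rare, 3))

# Hypothetical comparison: the same test applied where 1 in 10 adults is afflicted.
p_common = posterior_given_positive(prevalence=0.10, sensitivity=0.99, specificity=0.98)
print(round(p_common, 3))
```

With the same error rates, the posterior is far larger when the condition is common, which is the point made in the example about rarity driving the result.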
EXERCISES Section 2.4 (45-69)

45. The population of a particular country consists of three ethnic groups. Each individual
belongs to one of the four major blood groups. The accompanying joint probability table
gives the proportions of individuals in the various ethnic group-blood group combinations.

                         Blood Group
                     O      A      B      AB
                1   .082   .106   .008   .004
Ethnic Group    2   .135   .141   .018   .006
                3   .215   .200   .065   .020

Suppose that an individual is randomly selected from the population, and define events by
$A = \{\text{type A selected}\}$, $B = \{\text{type B selected}\}$, and $C = \{\text{ethnic group 3 selected}\}$.
a. Calculate $P(A)$, $P(C)$, and $P(A \cap C)$.
b. Calculate both $P(A \mid C)$ and $P(C \mid A)$, and explain in context what each of these
probabilities represents.
c. If the selected individual does not have type B blood, what is the probability that
he or she is from ethnic group 1?

46. Suppose an individual is randomly selected from the population of all adult males
living in the United States. Let $A$ be the event that the selected individual is over 6 ft
in height, and let $B$ be the event that the selected individual is a professional basketball
player. Which do you think is larger, $P(A \mid B)$ or $P(B \mid A)$? Why?

47. Return to the credit card scenario of Exercise 12 (Section 2.2), and let $C$ be the event
that the selected student has an American Express card. In addition to $P(A) = .6$,
$P(B) = .4$, and $P(A \cap B) = .3$, suppose that $P(C) = .2$, $P(A \cap C) = .15$,
$P(B \cap C) = .1$, and $P(A \cap B \cap C) = .08$.
a. What is the probability that the selected student has at least one of the three types
of cards?
b. What is the probability that the selected student has both a Visa card and a
MasterCard but not an American Express card?
c. Calculate and interpret $P(B \mid A)$ and also $P(A \mid B)$.
d. If we learn that the selected student has an American Express card, what is the
probability that she or he also has both a Visa card and a MasterCard?
e. Given that the selected student has an American Express card, what is the probability
that she or he has at least one of the other two types of cards?


48. Reconsider the system defect situation described in Exercise 26 (Section 2.2).
a. Given that the system has a type 1 defect, what is the probability that it has a
type 2 defect?
b. Given that the system has a type 1 defect, what is the probability that it has all
three types of defects?
c. Given that the system has at least one type of defect, what is the probability that
it has exactly one type of defect?
d. Given that the system has both of the first two types of defects, what is the
probability that it does not have the third type of defect?

49. The accompanying table gives information on the type of coffee selected by someone
purchasing a single cup at a particular airport kiosk.

            Small   Medium   Large
Regular      14%     20%      26%
Decaf        20%     10%      10%

Consider randomly selecting such a coffee purchaser.
a. What is the probability that the individual purchased a small cup? A cup of decaf
coffee?
b. If we learn that the selected individual purchased a small cup, what now is the
probability that he/she chose decaf coffee, and how would you interpret this
probability?
c. If we learn that the selected individual purchased decaf, what now is the probability
that a small size was selected, and how does this compare to the corresponding
unconditional probability of (a)?

50. A department store sells sport shirts in three sizes (small, medium, and large), three
patterns (plaid, print, and stripe), and two sleeve lengths (long and short). The
accompanying tables give the proportions of shirts sold in the various category
combinations.

Short-sleeved
              Pattern
Size     Pl     Pr     St
S       .04    .02    .05
M       .08    .07    .12
L       .03    .07    .08

Long-sleeved
              Pattern
Size     Pl     Pr     St
S       .03    .02    .03
M       .10    .05    .07
L       .04    .02    .08

a. What is the probability that the next shirt sold is a medium, long-sleeved, print shirt?
b. What is the probability that the next shirt sold is a medium print shirt?
c. What is the probability that the next shirt sold is a short-sleeved shirt? A
long-sleeved shirt?
d. What is the probability that the size of the next shirt sold is medium? That the
pattern of the next shirt sold is a print?
e. Given that the shirt just sold was a short-sleeved plaid, what is the probability that
its size was medium?
f. Given that the shirt just sold was a medium plaid, what is the probability that it was
short-sleeved? Long-sleeved?


51. According to a July 31, 2013, posting on cnn.com subsequent to the death of a child who
bit into a peanut, a 2010 study in the journal Pediatrics found that 8% of children younger
than 18 in the United States have at least one food allergy. Among those with food allergies,
about 39% had a history of severe reaction.
a. If a child younger than 18 is randomly selected, what is the probability that he or she
has at least one food allergy and a history of severe reaction?
b. It was also reported that 30% of those with an allergy in fact are allergic to multiple
foods. If a child younger than 18 is randomly selected, what is the probability that he
or she is allergic to multiple foods?

52. A system consists of two identical pumps, #1 and #2. If one pump fails, the system will
still operate. However, because of the added strain, the remaining pump is now more likely
to fail than was originally the case. That is,
$r = P(\text{\#2 fails} \mid \text{\#1 fails}) > P(\text{\#2 fails}) = q$. If at least one pump fails
by the end of the pump design life in 7% of all systems and both pumps fail during that
period in only 1%, what is the probability that pump #1 will fail during the pump design life?

53. A certain shop repairs both audio and video components. Let $A$ denote the event that the
next component brought in for repair is an audio component, and let $B$ be the event that the
next component is a compact disc player (so the event $B$ is contained in $A$). Suppose that
$P(A) = .6$ and $P(B) = .05$. What is $P(B \mid A)$?

54. In Exercise 13, $A_i = \{\text{awarded project } i\}$, for $i = 1, 2, 3$. Use the probabilities
given there to compute the following probabilities, and explain in words the meaning of
each one.
a. $P(A_2 \mid A_1)$
b. $P(A_2 \cap A_3 \mid A_1)$
c. $P(A_2 \cup A_3 \mid A_1)$
d. $P(A_1 \cap A_2 \cap A_3 \mid A_1 \cup A_2 \cup A_3)$

55. Deer ticks can be carriers of either Lyme disease or human granulocytic ehrlichiosis
(HGE). Based on a recent study, suppose that 16% of all ticks in a certain location carry
Lyme disease, 10% carry HGE, and 10% of the ticks that carry at least one of these diseases
in fact carry both of them. If a randomly selected tick is found to have carried HGE, what
is the probability that the selected tick is also a carrier of Lyme disease?

56. For any events $A$ and $B$ with $P(B) > 0$, show that $P(A \mid B) + P(A' \mid B) = 1$.

57. If $P(B \mid A) > P(B)$, show that $P(B' \mid A) < P(B')$. [Hint: Add $P(B' \mid A)$ to both
sides of the given inequality and then use the result of Exercise 56.]

58. Show that for any three events $A$, $B$, and $C$ with $P(C) > 0$,
$P(A \cup B \mid C) = P(A \mid C) + P(B \mid C) - P(A \cap B \mid C)$.


59. At a certain gas station, 40% of the customers use regular gas ($A_1$), 35% use plus gas
($A_2$), and 25% use premium ($A_3$). Of those customers using regular gas, only 30% fill
their tanks (event $B$). Of those customers using plus, 60% fill their tanks, whereas of
those using premium, 50% fill their tanks.
a. What is the probability that the next customer will request plus gas and fill the
tank ($A_2 \cap B$)?
b. What is the probability that the next customer fills the tank?
c. If the next customer fills the tank, what is the probability that regular gas is
requested? Plus? Premium?

60. Seventy percent of the light aircraft that disappear while in flight in a certain
country are subsequently discovered. Of the aircraft that are discovered, 60% have an
emergency locator, whereas 90% of the aircraft not discovered do not have such a locator.
Suppose a light aircraft has disappeared.
a. If it has an emergency locator, what is the probability that it will not be discovered?
b. If it does not have an emergency locator, what is the probability that it will be
discovered?

61. Components of a certain type are shipped to a supplier in batches of ten. Suppose that
50% of all such batches contain no defective components, 30% contain one defective
component, and 20% contain two defective components. Two components from a batch are
randomly selected and tested. What are the probabilities associated with 0, 1, and
2 defective components being in the batch under each of the following conditions?
a. Neither tested component is defective.
b. One of the two tested components is defective. [Hint: Draw a tree diagram with three
first-generation branches for the three different types of batches.]

62. Blue Cab operates 15% of the taxis in a certain city, and Green Cab operates the other
85%. After a nighttime hit-and-run accident involving a taxi, an eyewitness said the
vehicle was blue. Suppose, though, that under night vision conditions, only 80% of
individuals can correctly distinguish between a blue and a green vehicle. What is
the (posterior) probability that the taxi at fault was blue? In answering, be sure to
indicate which probability rules you are using. [Hint: A tree diagram might help. Note:
This is based on an actual incident.]

63. For customers purchasing a refrigerator at a certain appliance store, let $A$ be the
event that the refrigerator was manufactured in the U.S., $B$ be the event that the
refrigerator had an icemaker, and $C$ be the event that the customer purchased an extended
warranty. Relevant probabilities are

$P(A) = .75$   $P(B \mid A) = .9$   $P(B \mid A') = .8$
$P(C \mid A \cap B) = .8$   $P(C \mid A \cap B') = .6$
$P(C \mid A' \cap B) = .7$   $P(C \mid A' \cap B') = .3$

a. Construct a tree diagram consisting of first-, second-, and third-generation branches,
and place an event label and appropriate probability next to each branch.
b. Compute $P(A \cap B \cap C)$.
c. Compute $P(B \cap C)$.
d. Compute $P(C)$.
e. Compute $P(A \mid B \cap C)$, the probability of a U.S. purchase given that an icemaker
and extended warranty are also purchased.

64. The Reviews editor for a certain scientific journal decides whether the review for any
particular book should be short (1-2 pages), medium (3-4 pages), or long (5-6 pages). Data
on recent reviews indicates that 60% of them are short, 30% are medium, and the other
10% are long. Reviews are submitted in either Word or LaTeX. For short reviews, 80% are in
Word, whereas 50% of medium reviews are in Word and 30% of long reviews are in Word.
Suppose a recent review is randomly selected.
a. What is the probability that the selected review was submitted in Word format?
b. If the selected review was submitted in Word format, what are the posterior
probabilities of it being short, medium, or long?

65. A large operator of timeshare complexes requires anyone interested in making a purchase
to first visit the site of interest. Historical data indicates that 20% of all
potential purchasers select a day visit, 50% choose a one-night visit, and 30% opt for a
two-night visit. In addition, 10% of day visitors ultimately make a purchase, 30% of
one-night visitors buy a unit, and 20% of those visiting for two nights decide to buy.
Suppose a visitor is randomly selected and is found to have made a purchase. How likely is
it that this person made a day visit? A one-night visit? A two-night visit?

66. Consider the following information about travelers on vacation (based partly on a
recent Travelocity poll): 40% check work email, 30% use a cell phone to stay connected to
work, 25% bring a laptop with them, 23% both check work email and use a cell phone to stay
connected, and 51% neither check work email nor use a cell phone to stay connected nor
bring a laptop. In addition, 88 out of every 100 who bring a laptop also check work email,
and 70 out of every 100 who use a cell phone to stay connected also bring a laptop.
a. What is the probability that a randomly selected traveler who checks work email also
uses a cell phone to stay connected?
b. What is the probability that someone who brings a laptop on vacation also uses a cell
phone to stay connected?
c. If the randomly selected traveler checked work email and brought a laptop, what is the
probability that he/she uses a cell phone to stay connected?

67. There has been a great deal of controversy over the last several years regarding what
types of surveillance are appropriate to prevent terrorism. Suppose a particular
surveillance system has a 99% chance of correctly identifying a future terrorist and a
99.9% chance of correctly identifying someone who is not a future terrorist. If there
are 1000 future terrorists in a population of 300 million, and one of these 300 million is
randomly selected, scrutinized by the system, and identified as a future terrorist,
what is the probability that he/she actually is a future terrorist? Does the value of this
probability make you uneasy about using the surveillance system? Explain.

68. A friend who lives in Los Angeles makes frequent consulting trips to Washington, D.C.;
50% of the time she travels on airline #1, 30% of the time on airline #2, and
the remaining 20% of the time on airline #3. For airline #1, flights are late into D.C. 30%
of the time and late into L.A. 10% of the time. For airline #2, these percentages
are 25% and 20%, whereas for airline #3 the percentages are 40% and 25%. If we learn that
on a particular trip she arrived late at exactly one of the two destinations, what
are the posterior probabilities of having flown on airlines #1, #2, and #3? Assume that the
chance of a late arrival in L.A. is unaffected by what happens on the flight to D.C.
[Hint: From the tip of each first-generation branch on a tree diagram, draw three
second-generation branches labeled, respectively, 0 late, 1 late, and 2 late.]

69. In Exercise 59, consider the following additional information on credit card usage:

70% of all regular fill-up customers use a credit card.
50% of all regular non-fill-up customers use a credit card.
60% of all plus fill-up customers use a credit card.
50% of all plus non-fill-up customers use a credit card.
50% of all premium fill-up customers use a credit card.
40% of all premium non-fill-up customers use a credit card.

Compute the probability of each of the following events for the next customer to arrive
(a tree diagram might help).
a. {plus and fill-up and credit card}
b. {premium and non-fill-up and credit card}
c. {premium and credit card}
d. {fill-up and credit card}
e. {credit card}
f. If the next customer uses a credit card, what is the probability that premium was
requested?


2.5 Independence


The definition of conditional probability enables us to revise the probability P(A) 
originally assigned to A when we are subsequently informed that another event 
B has occurred; the new probability of A is P(A|B). In our examples, it frequently 
happened that P(A|B) differed from the unconditional probability P(A). Then the 
information “B has occurred” resulted in a change in the likelihood of A occurring. 
Often the chance that A will occur or has occurred is not affected by knowledge that 
B has occurred, so that P(A|B) = P(A). It is then natural to regard A and B as inde- 
pendent events, meaning that the occurrence or nonoccurrence of one event has no 
bearing on the chance that the other will occur. 


DEFINITION 


Two events A and B are independent if P(A|B) = P(A) and are dependent 
otherwise. 
The definition of independence might seem “unsymmetric” because we do
not also demand that $P(B \mid A) = P(B)$. However, using the definition of conditional
probability and the multiplication rule,

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \mid B)P(B)}{P(A)} \tag{2.7}$$

The right-hand side of Equation (2.7) is $P(B)$ if and only if $P(A \mid B) = P(A)$
(independence). Thus the equality in the definition implies the other equality (and
vice versa). It is also straightforward to show that if $A$ and $B$ are independent, then
so are the following pairs of events: (1) $A'$ and $B$, (2) $A$ and $B'$, and (3) $A'$ and $B'$.


EXAMPLE 2.32 Consider a gas station with six pumps numbered 1, 2, ..., 6, and let $E_i$ denote the simple
event that a randomly selected customer uses pump $i$ $(i = 1, \ldots, 6)$. Suppose that

$$P(E_1) = P(E_6) = .10, \quad P(E_2) = P(E_5) = .15, \quad P(E_3) = P(E_4) = .25$$

Define events $A$, $B$, $C$ by

$$A = \{2, 4, 6\}, \quad B = \{1, 2, 3\}, \quad C = \{2, 3, 4, 5\}.$$

We then have $P(A) = .50$, $P(A \mid B) = .30$, and $P(A \mid C) = .50$. That is, events $A$ and $B$
are dependent, whereas events $A$ and $C$ are independent. Intuitively, $A$ and $C$ are independent
because the relative division of probability among even- and odd-numbered
pumps is the same among pumps 2, 3, 4, 5 as it is among all six pumps.
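
The conditional probabilities in Example 2.32 can be verified with exact arithmetic. The Python sketch below assumes the pump probabilities are paired as $P(E_1) = P(E_6) = .10$, $P(E_2) = P(E_5) = .15$, and $P(E_3) = P(E_4) = .25$, which is the pairing that reproduces the stated values. The helper names are illustrative.

```python
from fractions import Fraction

# Assumed pump probabilities P(E_i), stored as exact fractions.
p = {1: Fraction(10, 100), 2: Fraction(15, 100), 3: Fraction(25, 100),
     4: Fraction(25, 100), 5: Fraction(15, 100), 6: Fraction(10, 100)}


def prob(event):
    """Probability of an event given as a set of pump numbers."""
    return sum((p[i] for i in event), Fraction(0))


def cond_prob(a, b):
    """P(a | b) from the definition of conditional probability."""
    return prob(a & b) / prob(b)


def independent(a, b):
    """Definition of independence: P(a | b) equals P(a)."""
    return cond_prob(a, b) == prob(a)


A, B, C = {2, 4, 6}, {1, 2, 3}, {2, 3, 4, 5}

print(prob(A), cond_prob(A, B), cond_prob(A, C))  # 1/2 3/10 1/2
print(independent(A, B), independent(A, C))       # False True
```

Exact fractions avoid the floating-point comparison problems that would otherwise make an equality test for independence unreliable.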


EXAMPLE 2.33 Let $A$ and $B$ be any two mutually exclusive events with $P(A) > 0$. For example, for
a randomly chosen automobile, let $A$ = {the car has a four-cylinder engine} and
$B$ = {the car has a six-cylinder engine}. Since the events are mutually exclusive, if $B$
occurs, then $A$ cannot possibly have occurred, so $P(A \mid B) = 0 \neq P(A)$. The message
here is that if two events are mutually exclusive, they cannot be independent. When $A$
and $B$ are mutually exclusive, the information that $A$ occurred says something about
$B$ (it cannot have occurred), so independence is precluded.


The Multiplication Rule for $P(A \cap B)$


Frequently the nature of an experiment suggests that two events A and B should be 
assumed independent. This is the case, for example, if a manufacturer receives a cir- 
cuit board from each of two different suppliers, each board is tested on arrival, and 
A = {first is defective} and B = {second is defective}. If P(A) = .1, it should also 
be the case that P(A |B) = .1; knowing the condition of the second board shouldn’t 
provide information about the condition of the first. The probability that both events 
will occur is easily calculated from the individual event probabilities when the 
events are independent. 


PROPOSITION $A$ and $B$ are independent if and only if (iff)

$$P(A \cap B) = P(A) \cdot P(B) \tag{2.8}$$


The verification of this multiplication rule is as follows:

$$P(A \cap B) = P(A \mid B) \cdot P(B) = P(A) \cdot P(B) \tag{2.9}$$

where the second equality in Equation (2.9) is valid iff $A$ and $B$ are independent.
Equivalence of independence and Equation (2.8) implies that the latter can be used as
a definition of independence.


EXAMPLE 2.34 It is known that 30% of a certain company’s washing machines require service while 
under warranty, whereas only 10% of its dryers need such service. If someone pur- 
chases both a washer and a dryer made by this company, what is the probability that 
both machines will need warranty service? 

Let A denote the event that the washer needs service while under warranty, 
and let B be defined analogously for the dryer. Then P(A) = .30 and P(B) = .10. 
Assuming that the two machines will function independently of one another, the 
desired probability is 


$$P(A \cap B) = P(A) \cdot P(B) = (.30)(.10) = .03$$


It is straightforward to show that $A$ and $B$ are independent iff $A'$ and $B$ are independent,
iff $A$ and $B'$ are independent, and iff $A'$ and $B'$ are independent. Thus in
Example 2.34, the probability that neither machine needs service is

$$P(A' \cap B') = P(A') \cdot P(B') = (.70)(.90) = .63$$


EXAMPLE 2.35 Each day, Monday through Friday, a batch of components sent by a first supplier 
arrives at a certain inspection facility. Two days a week, a batch also arrives from a 
second supplier. Eighty percent of all supplier 1’s batches pass inspection, and 90% 
of supplier 2’s do likewise. What is the probability that, on a randomly selected day, 
two batches pass inspection? We will answer this assuming that on days when two 
batches are tested, whether the first batch passes is independent of whether the sec- 
ond batch does so. Figure 2.13 displays the relevant information. 


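[Tree diagram: the branch for two batches received with both passing has probability $.4 \times (.8 \times .9)$]

Figure 2.13 Tree diagram for Example 2.35


$$\begin{aligned}
P(\text{two pass}) &= P(\text{two received} \cap \text{both pass}) \\
&= P(\text{both pass} \mid \text{two received}) \cdot P(\text{two received}) \\
&= [(.8)(.9)](.4) = .288
\end{aligned}$$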


Independence of More Than Two Events 


The notion of independence of two events can be generalized to collections of more 
than two events. Although it is possible to extend the definition for two independent 
events by working in terms of conditional and unconditional probabilities, it is more
direct and less cumbersome to proceed along the lines of the last proposition.


DEFINITION Events $A_1, \ldots, A_n$ are mutually independent if for every $k$ $(k = 2, 3, \ldots, n)$ and
every subset of indices $i_1, i_2, \ldots, i_k$,

$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdot P(A_{i_2}) \cdots P(A_{i_k})$$


To paraphrase the definition, the events are mutually independent if the probability
of the intersection of any subset of the $n$ events is equal to the product of the individual
probabilities. In using the multiplication property for more than two independent events,
it is legitimate to replace one or more of the $A_i$’s by their complements (e.g., if $A_1$, $A_2$,
and $A_3$ are independent events, so are $A_1'$, $A_2'$, and $A_3'$). As was the case with two events,
we frequently specify at the outset of a problem the independence of certain events. The
probability of an intersection can then be calculated via multiplication.
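
The phrase "every subset" in the definition matters, as a small check shows. The Python sketch below tests the product condition for every subset of size 2 through $n$ on a finite sample space with equally likely outcomes. The two-coin-toss events are a standard illustration chosen here as an assumption; they do not come from the text.

```python
from fractions import Fraction
from itertools import combinations


def prob(event, outcomes):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))


def mutually_independent(events, outcomes):
    """Check the product condition for every subset of size k = 2, ..., n."""
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            intersection = set(outcomes)
            product = Fraction(1)
            for e in subset:
                intersection &= e
                product *= prob(e, outcomes)
            if prob(intersection, outcomes) != product:
                return False
    return True


# Hypothetical experiment: two tosses of a fair coin.
outcomes = {"HH", "HT", "TH", "TT"}
A = {"HH", "HT"}   # first toss is heads
B = {"HH", "TH"}   # second toss is heads
C = {"HH", "TT"}   # the two tosses agree

pairwise = all(mutually_independent([x, y], outcomes)
               for x, y in combinations([A, B, C], 2))
print(pairwise, mutually_independent([A, B, C], outcomes))  # True False
```

Each pair of these events satisfies the product condition, yet the triple does not, so checking pairs alone is not enough to establish mutual independence.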


EXAMPLE 2.36 The article “Reliability Evaluation of Solar Photovoltaic Arrays” (Solar Energy, 
2002: 129-141) presents various configurations of solar photovoltaic arrays consisting 
of crystalline silicon solar cells. Consider first the system illustrated in Figure 2.14(a). 


1 2 3 1 2 3 
4 5 6 4 5 6 
(a) (b) 


Figure 2.14 System configurations for Example 2.36: (a) series-parallel; (b) total-cross-tied 


There are two subsystems connected in parallel, each one containing three cells. In 
order for the system to function, at least one of the two parallel subsystems must 
work. Within each subsystem, the three cells are connected in series, so a subsystem 
will work only if all cells in the subsystem work. Consider a particular lifetime value 
$t_0$, and suppose we want to determine the probability that the system lifetime exceeds
$t_0$. Let $A_i$ denote the event that the lifetime of cell $i$ exceeds $t_0$ ($i = 1, 2, \ldots, 6$).
We assume that the $A_i$'s are independent events (whether any particular cell lasts
more than $t_0$ hours has no bearing on whether or not any other cell does) and that
$P(A_i) = .9$ for every $i$ since the cells are identical. Then

$P(\text{system lifetime exceeds } t_0) = P[(A_1 \cap A_2 \cap A_3) \cup (A_4 \cap A_5 \cap A_6)]$
$= P(A_1 \cap A_2 \cap A_3) + P(A_4 \cap A_5 \cap A_6) - P[(A_1 \cap A_2 \cap A_3) \cap (A_4 \cap A_5 \cap A_6)]$
$= (.9)(.9)(.9) + (.9)(.9)(.9) - (.9)(.9)(.9)(.9)(.9)(.9) = .927$
Alternatively, 


$P(\text{system lifetime exceeds } t_0) = 1 - P(\text{both subsystem lives are} \le t_0)$
$= 1 - [P(\text{subsystem life is} \le t_0)]^2$
$= 1 - [1 - P(\text{subsystem life is} > t_0)]^2$
$= 1 - [1 - (.9)^3]^2 = .927$


Next consider the total-cross-tied system shown in Figure 2.14(b), obtained from the 
series-parallel array by connecting ties across each column of junctions. Now the
system fails as soon as an entire column fails, and system lifetime exceeds $t_0$ only if
the life of every column does so. For this configuration,

$P(\text{system lifetime is at least } t_0) = [P(\text{column lifetime exceeds } t_0)]^3$
$= [1 - P(\text{column lifetime is} \le t_0)]^3$
$= [1 - P(\text{both cells in a column have lifetime} \le t_0)]^3$
$= [1 - (1 - .9)^2]^3 = .970$ ■
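
The two reliability calculations in Example 2.36 can be reproduced numerically. This is a sketch that assumes, as the example does, independent cells with a common probability .9 of lasting beyond the chosen lifetime value; the helper names `series` and `parallel` are illustrative choices, not notation from the text.

```python
def series(*probs):
    """A series arrangement works only if every independent part works."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def parallel(*probs):
    """A parallel arrangement works if at least one independent part works."""
    all_fail = 1.0
    for p in probs:
        all_fail *= (1.0 - p)
    return 1.0 - all_fail


p = 0.9  # P(A_i) for each cell, as assumed in the example

# (a) Series-parallel: two three-cell series subsystems connected in parallel.
series_parallel = parallel(series(p, p, p), series(p, p, p))

# (b) Total-cross-tied: three two-cell parallel columns connected in series.
total_cross_tied = series(parallel(p, p), parallel(p, p), parallel(p, p))

print(f"{series_parallel:.3f}")   # 0.927
print(f"{total_cross_tied:.3f}")  # 0.970
```

Composing the two helpers mirrors the structure of the system diagram, so other configurations can be evaluated by nesting the calls differently.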


EXERCISES Section 2.5 (70-89) 




Reconsider the credit card scenario of Exercise 47 
(Section 2.4), and show that A and B are dependent first 
by using the definition of independence and then by 
verifying that the multiplication property does not hold. 


An oil exploration company currently has two active
projects, one in Asia and the other in Europe. Let A
be the event that the Asian project is successful and B be
the event that the European project is successful. Suppose
that A and B are independent events with P(A) = .4 and
P(B) = .7.

a. If the Asian project is not successful, what is the 
probability that the European project is also not suc- 
cessful? Explain your reasoning. 

b. What is the probability that at least one of the two 
projects will be successful? 

c. Given that at least one of the two projects is success- 
ful, what is the probability that only the Asian project 
is successful? 


In Exercise 13, is any $A_i$ independent of any other $A_j$?
Answer using the multiplication property for indepen- 
dent events. 


If A and B are independent events, show that A’ and B are 
also independent. [Hint: First establish a relationship 
between $P(A' \cap B)$, $P(B)$, and $P(A \cap B)$.]


The proportions of blood phenotypes in the U.S. popula- 
tion are as follows: 


A      B      AB     O
.40    .11    .04    .45


Assuming that the phenotypes of two randomly selected 
individuals are independent of one another, what is the 
probability that both phenotypes are O? What is the 
probability that the phenotypes of two randomly selected 
individuals match? 


One of the assumptions underlying the theory of control 
charting (see Chapter 16) is that successive plotted 
points are independent of one another. Each plotted point 
can signal either that a manufacturing process is operating
correctly or that there is some sort of malfunction.
Even when a process is running correctly, there is a
small probability that a particular point will signal a 
problem with the process. Suppose that this probability 
is .05. What is the probability that at least one of 10 
successive points indicates a problem when in fact the 
process is operating correctly? Answer this question for 
25 successive points. 


In October, 1994, a flaw in a certain Pentium chip 
installed in computers was discovered that could result in 
a wrong answer when performing a division. The manu- 
facturer initially claimed that the chance of any particular 
division being incorrect was only 1 in 9 billion, so that it 
would take thousands of years before a typical user 
encountered a mistake. However, statisticians are not 
typical users; some modern statistical techniques are so 
computationally intensive that a billion divisions over a 
short time period is not outside the realm of possibility. 
Assuming that the 1 in 9 billion figure is correct and that 
results of different divisions are independent of one 
another, what is the probability that at least one error 
occurs in one billion divisions with this chip? 


An aircraft seam requires 25 rivets. The seam will have
to be reworked if any of these rivets is defective. Suppose
rivets are defective independently of one another, each
with the same probability.

a. If 15% of all seams need reworking, what is the 
probability that a rivet is defective? 

b. How small should the probability of a defective rivet be 
to ensure that only 10% of all seams need reworking? 


A boiler has five identical relief valves. The probability 
that any particular valve will open on demand is .96. 
Assuming independent operation of the valves, calculate 
P(at least one valve opens) and P(at least one valve fails 
to open). 


Two pumps connected in parallel fail independently of 
one another on any given day. The probability that only 
the older pump will fail is .10, and the probability that 
only the newer pump will fail is .05. What is the proba- 
bility that the pumping system will fail on any given day 
(which happens if both pumps fail)?

Consider the system of components connected as in the
accompanying picture. Components 1 and 2 are connected
in parallel, so that subsystem works iff either 1 or 2 works; 
since 3 and 4 are connected in series, that subsystem works 
iff both 3 and 4 work. If components work independently 
of one another and P(component i works) = .9 for i = 1,2 
and = .8 for i = 3,4, calculate P(system works). 


[Picture not reproduced: diagram of the system of components 1, 2, 3, and 4]


Refer back to the series-parallel system configuration 
introduced in Example 2.36, and suppose that there are 
only two cells rather than three in each parallel subsystem 
[in Figure 2.14(a), eliminate cells 3 and 6, and renumber 
cells 4 and 5 as 3 and 4]. Using $P(A_i) = .9$, the probability
that system lifetime exceeds $t_0$ is easily seen to be .9639.
To what value would .9 have to be changed in order to
increase the system lifetime reliability from .9639 to .99?
[Hint: Let $P(A_i) = p$, express system reliability in terms
of $p$, and then let $x = p^2$.]


Consider independently rolling two fair dice, one red and 
the other green. Let A be the event that the red die shows 
3 dots, B be the event that the green die shows 4 dots, and 
C be the event that the total number of dots showing on 
the two dice is 7. Are these events pairwise independent 
(i.e., are A and B independent events, are A and C inde- 
pendent, and are B and C independent)? Are the three 
events mutually independent? 


Components arriving at a distributor are checked for 
defects by two different inspectors (each component is 
checked by both inspectors). The first inspector detects 
90% of all defectives that are present, and the second 
inspector does likewise. At least one inspector does not 
detect a defect on 20% of all defective components. 
What is the probability that the following occur? 
a. A defective component will be detected only by the 
first inspector? By exactly one of the two inspectors? 
b. All three defective components in a batch escape 
detection by both inspectors (assuming inspections 
of different components are independent of one 
another)? 


Consider purchasing a system of audio components con- 
sisting of a receiver, a pair of speakers, and a CD player. 
Let $A_1$ be the event that the receiver functions properly
throughout the warranty period, $A_2$ be the event that the
speakers function properly throughout the warranty
period, and $A_3$ be the event that the CD player functions
properly throughout the warranty period. Suppose that
these events are (mutually) independent with $P(A_1) = .95$,
$P(A_2) = .98$, and $P(A_3) = .80$.

a. What is the probability that all three components
function properly throughout the warranty period? 

b. What is the probability that at least one component 
needs service during the warranty period? 

c. What is the probability that all three components 
need service during the warranty period? 

d. What is the probability that only the receiver needs 
service during the warranty period? 

e. What is the probability that exactly one of the three 
components needs service during the warranty 
period? 

f. What is the probability that all three components 
function properly throughout the warranty period but 
that at least one fails within a month after the war- 
ranty expires? 


A quality control inspector is examining newly produced 
items for faults. The inspector searches an item for faults in 
a series of independent fixations, each of a fixed duration. 
Given that a flaw is actually present, let p denote the prob- 
ability that the flaw is detected during any one fixation (this 
model is discussed in “Human Performance in Sampling
Inspection,” Human Factors, 1979: 99-105).

a. Assuming that an item has a flaw, what is the proba- 
bility that it is detected by the end of the second 
fixation (once a flaw has been detected, the sequence 
of fixations terminates)? 

b. Give an expression for the probability that a flaw will 
be detected by the end of the nth fixation. 

c. If when a flaw has not been detected in three fixa- 
tions, the item is passed, what is the probability that 
a flawed item will pass inspection? 

d. Suppose 10% of all items contain a flaw [P(randomly 
chosen item is flawed) = .1]. With the assumption of 
part (c), what is the probability that a randomly chosen 
item will pass inspection (it will automatically pass if 
it is not flawed, but could also pass if it is flawed)? 

e. Given that an item has passed inspection (no flaws in 
three fixations), what is the probability that it is actu- 
ally flawed? Calculate for p = .5. 


a. A lumber company has just taken delivery on a shipment
of 10,000 2 × 4 boards. Suppose that 20% of
these boards (2000) are actually too green to be used in
first-quality construction. Two boards are selected at
random, one after the other. Let A = {the first board is
green} and B = {the second board is green}. Compute
$P(A)$, $P(B)$, and $P(A \cap B)$ (a tree diagram might
help). Are A and B independent?

b. With A and B independent and $P(A) = P(B) = .2$,
what is $P(A \cap B)$? How much difference is there
between this answer and $P(A \cap B)$ in part (a)? For
purposes of calculating $P(A \cap B)$, can we assume
that A and B of part (a) are independent to obtain
essentially the correct probability?

c. Suppose the shipment consists of ten boards, of which
two are green. Does the assumption of independence
now yield approximately the correct answer for
$P(A \cap B)$? What is the critical difference between the
situation here and that of part (a)? When do you think
an independence assumption would be valid in obtaining
an approximately correct answer to $P(A \cap B)$?


Consider randomly selecting a single individual and 
having that person test drive 3 different vehicles. Define 
events $A_1$, $A_2$, and $A_3$ by
$A_1$ = likes vehicle #1
$A_2$ = likes vehicle #2
$A_3$ = likes vehicle #3
Suppose that $P(A_1) = .55$, $P(A_2) = .65$, $P(A_3) = .70$,
$P(A_1 \cup A_2) = .80$, $P(A_2 \cap A_3) = .40$, and
$P(A_1 \cup A_2 \cup A_3) = .88$.

a. What is the probability that the individual likes both
vehicle #1 and vehicle #2?

b. Determine and interpret $P(A_2 \mid A_3)$.

c. Are $A_2$ and $A_3$ independent events? Answer in two
different ways.

d. If you learn that the individual did not like vehicle
#1, what now is the probability that he/she liked at
least one of the other two vehicles?


The probability that an individual randomly selected
from a particular population has a certain disease is .05. 
A diagnostic test correctly detects the presence of the 
disease 98% of the time and correctly detects the absence 
of the disease 99% of the time. If the test is applied 
twice, the two test results are independent, and both are 
positive, what is the (posterior) probability that the 
selected individual has the disease? [Hint: Tree diagram 
with first-generation branches corresponding to Disease 
and No Disease, and second- and third-generation
branches corresponding to results of the two tests.] 


Suppose identical tags are placed on both the left ear 
and the right ear of a fox. The fox is then let loose for a 
period of time. Consider the two events
$C_1$ = {left ear tag is lost} and $C_2$ = {right ear tag is lost}.
Let $\pi = P(C_1) = P(C_2)$, and assume $C_1$ and $C_2$ are independent
events. Derive an expression (involving $\pi$) for
the probability that exactly one tag is lost, given that at 
most one is lost (“Ear Tag Loss in Red Foxes,” J. 
Wildlife Mgmt., 1976: 164-167). [Hint: Draw a tree 
diagram in which the two initial branches refer to 
whether the left ear tag was lost.] 


SUPPLEMENTARY EXERCISES (90-114)

A certain legislative committee consists of 10 senators. A
subcommittee of 3 senators is to be randomly selected.

a. How many different such subcommittees are there? 

b. If the senators are ranked 1, 2,..., 10 in order of 
seniority, how many different subcommittees would 
include the most senior senator? 

c. What is the probability that the selected subcommit- 
tee has at least 1 of the 5 most senior senators? 

d. What is the probability that the subcommittee 
includes neither of the two most senior senators? 


A factory uses three production lines to manufacture cans 
of a certain type. The accompanying table gives percent- 
ages of nonconforming cans, categorized by type of non- 
conformance, for each of the three lines during a particular 
time period. 


                    Line 1    Line 2    Line 3
Blemish               15        12        20
Crack                 50        44        40
Pull-Tab Problem      21        28        24
Surface Defect        10         8        15
Other                  4         8         2


During this period, line 1 produced 500 nonconform- 
ing cans, line 2 produced 400 such cans, and line 3 was 
responsible for 600 nonconforming cans. Suppose that 
one of these 1500 cans is randomly selected.

a. What is the probability that the can was produced by
line 1? That the reason for nonconformance is a 
crack? 

b. If the selected can came from line 1, what is the 
probability that it had a blemish? 

c. Given that the selected can had a surface defect, what 
is the probability that it came from line 1? 


An employee of the records office at a certain university 
currently has ten forms on his desk awaiting processing. 
Six of these are withdrawal petitions and the other four 
are course substitution requests. 

a. If he randomly selects six of these forms to give to a 
subordinate, what is the probability that only one of 
the two types of forms remains on his desk? 

b. Suppose he has time to process only four of these 
forms before leaving for the day. If these four are 
randomly selected one by one, what is the probability 
that each succeeding form is of a different type from 
its predecessor? 


One satellite is scheduled to be launched from Cape 
Canaveral in Florida, and another launching is scheduled 
for Vandenberg Air Force Base in California. Let A denote 
the event that the Vandenberg launch goes off on schedule, 
and let B represent the event that the Cape Canaveral 
launch goes off on schedule. If A and B are independent 
events with $P(A) > P(B)$, $P(A \cup B) = .626$, and
$P(A \cap B) = .144$, determine the values of $P(A)$ and $P(B)$.

A transmitter is sending a message by using a binary code,
namely, a sequence of 0’s and 1’s. Each transmitted bit (0 
or 1) must pass through three relays to reach the receiver. 
At each relay, the probability is .20 that the bit sent will be 
different from the bit received (a reversal). Assume that 
the relays operate independently of one another. 


Transmitter → Relay 1 → Relay 2 → Relay 3 → Receiver

a. If a 1 is sent from the transmitter, what is the probability
that a 1 is sent by all three relays?

b. If a 1 is sent from the transmitter, what is the probability
that a 1 is received by the receiver? [Hint: The
eight experimental outcomes can be displayed on a 
tree diagram with three generations of branches, one 
generation for each relay.] 

c. Suppose 70% of all bits sent from the transmitter are 
1s. If a 1 is received by the receiver, what is the prob- 
ability that a 1 was sent? 


Individual A has a circle of five close friends (B, C, D,
E, and F). A has heard a certain rumor from outside the
circle and has invited the five friends to a party to circulate
the rumor. To begin, A selects one of the five at
random and tells the rumor to the chosen individual.
That individual then selects at random one of the four
remaining individuals and repeats the rumor. Continuing,
a new individual is selected from those not already having
heard the rumor by the individual who has just heard
it, until everyone has been told.

a. What is the probability that the rumor is repeated in 
the order B, C, D, E, and F? 

b. What is the probability that F is the third person at 
the party to be told the rumor? 

c. What is the probability that F is the last person to 
hear the rumor? 

d. If at each stage the person who currently “has” the 
rumor does not know who has already heard it and 
selects the next recipient at random from all five 
possible individuals, what is the probability that F 
has still not heard the rumor after it has been told ten 
times at the party? 


According to the article “Optimization of Distribution 
Parameters for Estimating Probability of Crack 
Detection” (J. of Aircraft, 2009: 2090-2097), the following
“Palmberg” equation is commonly used to determine
the probability $P_d(c)$ of detecting a crack of size $c$
in an aircraft structure:

$P_d(c) = \dfrac{(c/c^*)^\beta}{1 + (c/c^*)^\beta}$

where $c^*$ is the crack size that corresponds to a .5 detection
probability (and thus is an assessment of the quality
of the inspection process).

a. Verify that $P_d(c^*) = .5$.

b. What is $P_d(2c^*)$ when $\beta = 4$?

c. Suppose an inspector inspects two different panels,
one with a crack size of $c^*$ and the other with a crack
size of $2c^*$. Again assuming $\beta = 4$ and also that the
results of the two inspections are independent of one
another, what is the probability that exactly one of
the two cracks will be detected?

d. What happens to $P_d(c)$ as $\beta \to \infty$?


A chemical engineer is interested in determining whether a 
certain trace impurity is present in a product. An experiment 
has a probability of .80 of detecting the impurity if it is 
present. The probability of not detecting the impurity if it is 
absent is .90. The prior probabilities of the impurity being 
present and being absent are .40 and .60, respectively. Three 
separate experiments result in only two detections. What is 
the posterior probability that the impurity is present? 


Five friends—Allison, Beth, Carol, Diane, and Evelyn— 
have identical calculators and are studying for a statistics 
exam. They set their calculators down in a pile before taking 
a study break and then pick them up in random order when 
they return from the break. What is the probability that at 
least one of the five gets her own calculator? [Hint: Let A 
be the event that Allison gets her own calculator, and define
events B, C, D, and E analogously for the other four stu- 
dents.] How can the event {at least one gets her own calcu- 
lator} be expressed in terms of the five events A, B, C, D, 
and E? Now use a general law of probability. [Note: This is 
called the matching problem. Its solution is easily extended 
to n individuals. Can you recognize the result when n is 
large (the approximation to the resulting series)?] 


Fasteners used in aircraft manufacturing are slightly 
crimped so that they lock enough to avoid loosening 
during vibration. Suppose that 95% of all fasteners pass 
an initial inspection. Of the 5% that fail, 20% are so 
seriously defective that they must be scrapped. The 
remaining fasteners are sent to a recrimping operation, 
where 40% cannot be salvaged and are discarded. The 
other 60% of these fasteners are corrected by the 
recrimping process and subsequently pass inspection. 

a. What is the probability that a randomly selected 
incoming fastener will pass inspection either initially 
or after recrimping? 

b. Given that a fastener passed inspection, what is the 
probability that it passed the initial inspection and 
did not need recrimping? 


Jay and Maurice are playing a tennis match. In one particular
game, they have reached deuce, which means each
player has won three points. To finish the game, one of
the two players must get two points ahead of the other.
For example, Jay will win if he wins the next two points
(JJ), or if Maurice wins the next point and Jay the three
points after that (MJJJ), or if the result of the next six
points is JMMJJJ, and so on.

a. Suppose that the probability of Jay winning a point 
is .6 and outcomes of successive points are inde- 
pendent of one another. What is the probability that 
Jay wins the game? [Hint: In the law of total probability,
let $A_1$ = Jay wins each of the next two
points, $A_2$ = Maurice wins each of the next two
points, and $A_3$ = each player wins one of the next
two points. Also let $p = P(\text{Jay wins the game})$.
How does $p$ compare to $P(\text{Jay wins the game} \mid A_3)$?]

b. If Jay wins the game, what is the probability that he 
needed only two points to do so? 


A system consists of two components. The probability that 
the second component functions in a satisfactory manner 
during its design life is .9, the probability that at least one of 
the two components does so is .96, and the probability that 
both components do so is .75. Given that the first component 
functions in a satisfactory manner throughout its design life, 
what is the probability that the second one does also? 


The accompanying table categorizing each student in a 
sample according to gender and eye color appeared in the 
article “Does Eye Color Depend on Gender? It Might 
Depend on Who or How You Ask” (J. of Statistics 
Educ., 2013, Vol. 21, Num. 2). 


                        Eye Color
Gender     Blue    Brown    Green    Hazel    Total
Male        370      352      198      187     1107
Female      359      290      110      160      919
Total       729      642      308      347     2026


Suppose that one of these 2026 students is randomly
selected. Let F denote the event that the selected individual
is a female, and A, B, C, and D represent the events that he
or she has blue, brown, green, and hazel eyes, respectively.

a. Calculate both $P(F)$ and $P(C)$.

b. Calculate $P(F \cap C)$. Are the events F and C independent?
Why or why not?

c. If the selected individual has green eyes, what is the 
probability that he or she is a female? 

d. If the selected individual is female, what is the prob- 
ability that she has green eyes? 

e. What is the “conditional distribution” of eye color for 
females (i.e., $P(A \mid F)$, $P(B \mid F)$, $P(C \mid F)$, and $P(D \mid F)$), and
what is it for males? Compare the two distributions. 


a. A certain company sends 40% of its overnight mail
parcels via express mail service $E_1$. Of these parcels,
2% arrive after the guaranteed delivery time (denote
the event “late delivery” by L). If a record of an
overnight mailing is randomly selected from the
company’s file, what is the probability that the parcel
went via $E_1$ and was late?

b. Suppose that 50% of the overnight parcels are sent via
express mail service $E_2$ and the remaining 10% are
sent via $E_3$. Of those sent via $E_2$, only 1% arrive late,
whereas 5% of the parcels handled by $E_3$ arrive late.
What is the probability that a randomly selected
parcel arrived late?

c. If a randomly selected parcel has arrived on time,
what is the probability that it was not sent via $E_1$?


A company uses three different assembly lines ($A_1$, $A_2$,
and $A_3$) to manufacture a particular component. Of those
manufactured by line $A_1$, 5% need rework to remedy a
defect, whereas 8% of $A_2$’s components need rework and
10% of $A_3$’s need rework. Suppose that 50% of all components
are produced by line $A_1$, 30% are produced by
line $A_2$, and 20% come from line $A_3$. If a randomly
selected component needs rework, what is the probability
that it came from line $A_1$? From line $A_2$? From line $A_3$?


Disregarding the possibility of a February 29 birthday, 
suppose a randomly selected individual is equally likely 
to have been born on any one of the other 365 days. 

a. If ten people are randomly selected, what is the prob- 
ability that all have different birthdays? That at 
least two have the same birthday? 

b. With k replacing ten in part (a), what is the smallest 
k for which there is at least a 50-50 chance that two 
or more people will have the same birthday? 

c. If ten people are randomly selected, what is the prob- 
ability that either at least two have the same birthday 
or at least two have the same last three digits of their 
Social Security numbers? [Note: The article “Methods 
for Studying Coincidences” (F. Mosteller and 
P. Diaconis, J. Amer. Stat. Assoc., 1989: 853-861) 
discusses problems of this type.] 


One method used to distinguish between granitic (G) and 
basaltic (B) rocks is to examine a portion of the infrared 
spectrum of the sun’s energy reflected from the rock 
surface. Let $R_1$, $R_2$, and $R_3$ denote measured spectrum
intensities at three different wavelengths; typically, for
granite $R_1 < R_2 < R_3$, whereas for basalt $R_3 < R_1 < R_2$.
When measurements are made remotely (using aircraft),
various orderings of the $R_i$'s may arise whether the rock is
basalt or granite. Flights over regions of known composition
have yielded the following information:


                     Granite    Basalt
$R_1 < R_2 < R_3$      60%        10%
$R_1 < R_3 < R_2$      25%        20%
$R_3 < R_1 < R_2$      15%        10%


Suppose that for a randomly selected rock in a certain
region, P(granite) = .25 and P(basalt) = .75.

a. Show that $P(\text{granite} \mid R_1 < R_2 < R_3) > P(\text{basalt} \mid R_1 < R_2 < R_3)$.
If measurements yielded $R_1 < R_2 < R_3$,
would you classify the rock as granite or basalt?

b. If measurements yielded $R_1 < R_3 < R_2$, how would
you classify the rock? Answer the same question for
$R_3 < R_1 < R_2$.

c. Using the classification rules indicated in parts (a) 
and (b), when selecting a rock from this region, what 
is the probability of an erroneous classification? 
[Hint: Either G could be classified as B or B as G, 
and P(B) and P(G) are known.]

d. If P(granite) = p rather than .25, are there values of
p (other than 1) for which one would always classify
a rock as granite? 


A subject is allowed a sequence of glimpses to detect a 
target. Let $G_i$ = {the target is detected on the $i$th glimpse},
with $p_i = P(G_i)$. Suppose the $G_i$'s are independent events,
and write an expression for the probability that the target
has been detected by the end of the $n$th glimpse. [Note:
This model is discussed in “Predicting Aircraft
Detectability,” Human Factors, 1979: 277-291.]


In a Little League baseball game, team A’s pitcher throws 
a strike 50% of the time and a ball 50% of the time, suc- 
cessive pitches are independent of one another, and the 
pitcher never hits a batter. Knowing this, team B’s man- 
ager has instructed the first batter not to swing at any- 
thing. Calculate the probability that 


a. The batter walks on the fourth pitch 


b. The batter walks on the sixth pitch (so two of the first 
five must be strikes), using a counting argument or 
constructing a tree diagram 


c. The batter walks 


d. The first batter up scores while no one is out (assuming
that each batter pursues a no-swing strategy)


Four engineers, A, B, C, and D, have been scheduled for 
job interviews at 10 A.M. on Friday, January 13, at Random 
Sampling, Inc. The personnel manager has scheduled the 
four for interview rooms 1, 2, 3, and 4, respectively. 
However, the manager’s secretary does not know this, so 
assigns them to the four rooms in a completely random 
fashion (what else!). What is the probability that 

a. All four end up in the correct rooms? 

b. None of the four ends up in the correct room? 


A particular airline has 10 a.m. flights from Chicago to 
New York, Atlanta, and Los Angeles. Let A denote the 
event that the New York flight is full and define events B 
and C analogously for the other two flights. Suppose 
P(A) = .9, P(B) = .7, P(C) = .8 and the three events are 
independent. What is the probability that 

a. All three flights are full? That at least one flight is
not full?

b. Only the New York flight is full? That exactly one of
the three flights is full? 


A personnel manager is to interview four candidates for 
a job. These are ranked 1, 2, 3, and 4 in order of prefer- 
ence and will be interviewed in random order. However, 
at the conclusion of each interview, the manager will 
know only how the current candidate compares to those 
previously interviewed. For example, the interview order 
3, 4, 1, 2 generates no information after the first inter- 
view, shows that the second candidate is worse than the 
first, and that the third is better than the first two. 
However, the order 3, 4, 2, 1 would generate the same 
information after each of the first three interviews. The 
manager wants to hire the best candidate but must make 
an irrevocable hire/no hire decision after each interview. 
Consider the following strategy: Automatically reject the 
first s candidates and then hire the first subsequent candi- 
date who is best among those already interviewed (if no 
such candidate appears, the last one interviewed is hired). 

For example, with s = 2, the order 3, 4, 1, 2 would 
result in the best being hired, whereas the order 3, 1, 2, 4 
would not. Of the four possible s values (0, 1, 2, and 3), 
which one maximizes P(best is hired)? [Hint: Write out 
the 24 equally likely interview orderings: s = 0 means 
that the first candidate is automatically hired.] 


Consider four independent events $A_1$, $A_2$, $A_3$, and $A_4$, and
let $p_i = P(A_i)$ for $i = 1, 2, 3, 4$. Express the probability that
at least one of these four events occurs in terms of the $p_i$'s,
and do the same for the probability that at least two of the 
events occur. 


A box contains the following four slips of paper, each 
having exactly the same dimensions: (1) win prize 1; (2) 
win prize 2; (3) win prize 3; (4) win prizes 1, 2, and 3. 
One slip will be randomly selected. Let $A_1$ = {win prize 1},
$A_2$ = {win prize 2}, and $A_3$ = {win prize 3}. Show that
$A_1$ and $A_2$ are independent, that $A_1$ and $A_3$ are independent,
and that $A_2$ and $A_3$ are also independent (this is
pairwise independence). However, show that
$P(A_1 \cap A_2 \cap A_3) \ne P(A_1) \cdot P(A_2) \cdot P(A_3)$, so the three
events are not mutually independent.


Show that if $A_1$, $A_2$, and $A_3$ are independent events, then
$P(A_1 \mid A_2 \cap A_3) = P(A_1)$.


Discrete Random Variables and Probability Distributions


INTRODUCTION 


Whether an experiment yields qualitative or quantitative outcomes, methods of 
statistical analysis require that we focus on certain numerical aspects of the
data (such as a sample proportion $x/n$, mean $\bar{x}$, or standard deviation $s$). The
concept of a random variable allows us to pass from the experimental out- 
comes themselves to a numerical function of the outcomes. There are two fun- 
damentally different types of random variables—discrete random variables and 
continuous random variables. In this chapter, we examine the basic properties 
and discuss the most important examples of discrete variables. Chapter 4 focuses 
on continuous random variables. 
3.1 Random Variables


In any experiment, there are numerous characteristics that can be observed or mea- 
sured, but in most cases an experimenter will focus on some specific aspect or 
aspects of a sample. For example, in a study of commuting patterns in a metropoli- 
tan area, each individual in a sample might be asked about commuting distance and 
the number of people commuting in the same vehicle, but not about IQ, income, fam- 
ily size, and other such characteristics. Alternatively, a researcher may test a sample 
of components and record only the number that have failed within 1000 hours, rather 
than record the individual failure times. 

In general, each outcome of an experiment can be associated with a number by 
specifying a rule of association (e.g., the number among the sample of ten compo- 
nents that fail to last 1000 hours or the total weight of baggage for a sample of 25 air- 
line passengers). Such a rule of association is called a random variable—a variable 
because different numerical values are possible and random because the observed 
value depends on which of the possible experimental outcomes results (Figure 3.1). 


Figure 3.1. A random variable 


DEFINITION For a given sample space 𝒮 of some experiment, a random variable (rv) is
any rule that associates a number with each outcome in 𝒮. In mathematical
language, a random variable is a function whose domain is the sample space 
and whose range is the set of real numbers. 


Random variables are customarily denoted by uppercase letters, such as X and 
Y, near the end of our alphabet. In contrast to our previous use of a lowercase letter, 
such as x, to denote a variable, we will now use lowercase letters to represent some 
particular value of the corresponding random variable. The notation X(ω) = x means
that x is the value associated with the outcome ω by the rv X.


EXAMPLE 3.1 When a student calls a university help desk for technical support, he/she will either 
immediately be able to speak to someone (S, for success) or will be placed on hold 
(F, for failure). With 𝒮 = {S, F}, define an rv X by


X(S)=1 X(F)=0 


The rv X indicates whether (1) or not (0) the student can immediately speak to
someone.


The rv X in Example 3.1 was specified by explicitly listing each element of 𝒮
and the associated number. Such a listing is tedious if 𝒮 contains more than a few
outcomes, but it can frequently be avoided. 


EXAMPLE 3.2 Consider the experiment in which a telephone number in a certain area code is dialed 
using a random number dialer (such devices are used extensively by polling organi- 
zations), and define an rv Y by 
Y = { 1   if the selected number is unlisted
    { 0   if the selected number is listed in the directory


For example, if 5282966 appears in the telephone directory, then Y(5282966) = 0, 
whereas Y(7727350) = 1 tells us that the number 7727350 is unlisted. A word 
description of this sort is more economical than a complete listing, so we will use 
such a description whenever possible.


In Examples 3.1 and 3.2, the only possible values of the random variable were 
0 and 1. Such a random variable arises frequently enough to be given a special name, 
after the individual who first studied it. 


DEFINITION Any random variable whose only possible values are 0 and 1 is called a 
Bernoulli random variable. 


We will sometimes want to consider several different random variables from 
the same sample space. 


EXAMPLE 3.3. Example 2.3 described an experiment in which the number of pumps in use at each 
of two six-pump gas stations was determined. Define rv’s X, Y, and U by 
X = the total number of pumps in use at the two stations

Y = the difference between the number of pumps in use at station 1
and the number in use at station 2

U = the maximum of the numbers of pumps in use at the two stations

If this experiment is performed and ω = (2, 3) results, then X((2, 3)) = 2 + 3 = 5, so
we say that the observed value of X was x = 5. Similarly, the observed value of Y would
be y = 2 − 3 = −1, and the observed value of U would be u = max(2, 3) = 3.
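
The idea that a random variable is simply a function on the sample space can be sketched in a few lines of Python. This is an illustration only: it assumes that each outcome is a pair giving the number of pumps in use (0 through 6) at station 1 and station 2, and the function names mirror the rv's X, Y, and U of Example 3.3.

```python
from itertools import product

# Assumed sample space: every pair (pumps in use at station 1, at station 2),
# with 0 through 6 pumps possible at each six-pump station.
sample_space = list(product(range(7), repeat=2))

def X(outcome):
    """Total number of pumps in use at the two stations."""
    return outcome[0] + outcome[1]

def Y(outcome):
    """Number in use at station 1 minus the number in use at station 2."""
    return outcome[0] - outcome[1]

def U(outcome):
    """Maximum of the numbers of pumps in use at the two stations."""
    return max(outcome)

# Observed values for the outcome (2, 3) discussed in the example.
omega = (2, 3)
observed = (X(omega), Y(omega), U(omega))

# The set of possible values of each rv is the image of the sample space.
possible_X = sorted({X(w) for w in sample_space})
possible_Y = sorted({Y(w) for w in sample_space})
possible_U = sorted({U(w) for w in sample_space})
```

Applying several functions to the same outcome is exactly what it means to consider several random variables on one sample space.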


Each of the random variables of Examples 3.1—3.3 can assume only a finite 
number of possible values. This need not be the case. 


EXAMPLE 3.4 Consider an experiment in which 9-volt batteries are tested until one with an acceptable 
voltage (S) is obtained. The sample space is 𝒮 = {S, FS, FFS, ...}. Define an rv X by


X = the number of batteries tested before the experiment terminates 


Then X(S) = 1, X(FS) = 2, X(FFS) = 3, ..., X(FFFFFFS) = 7, and so on. Any
positive integer is a possible value of X, so the set of possible values is infinite.


EXAMPLE 3.5 Suppose that in some random fashion, a location (latitude and longitude) in the con- 
tinental United States is selected. Define an rv Y by 


Y = the height above sea level at the selected location 


For example, if the selected location were (39°50’N, 98°35’W), then we might have 
Y((39°50'N, 98°35’ W)) = 1748.26 ft. The largest possible value of Y is 14,494 
(Mt. Whitney), and the smallest possible value is —282 (Death Valley). The set of all 
possible values of Y is the set of all numbers in the interval between —282 and 14,494— 
that is, 


{y: y is a number, −282 ≤ y ≤ 14,494}


and there are an infinite number of numbers in this interval.
Two Types of Random Variables


In Section 1.2, we distinguished between data resulting from observations on a count- 
ing variable and data obtained by observing values of a measurement variable. A 
slightly more formal distinction characterizes two different types of random variables. 


DEFINITION A discrete random variable is an rv whose possible values either constitute a 
finite set or else can be listed in an infinite sequence in which there is a first 
element, a second element, and so on (“countably” infinite). 

A random variable is continuous if both of the following apply: 


1. Its set of possible values consists either of all numbers in a single interval
on the number line (possibly infinite in extent, e.g., from −∞ to ∞) or all
numbers in a disjoint union of such intervals (e.g., [0, 10] ∪ [20, 30]).


2. No possible value of the variable has positive probability, that is, 
P(X = c) = 0 for any possible value c. 


Although any interval on the number line contains an infinite number of numbers, it 
can be shown that there is no way to create an infinite listing of all these values— 
there are just too many of them. The second condition describing a continuous 
random variable is perhaps counterintuitive, since it would seem to imply a total 
probability of zero for all possible values. But we shall see in Chapter 4 that inter- 
vals of values have positive probability; the probability of an interval will decrease 
to zero as the width of the interval shrinks to zero. 


EXAMPLE 3.6 All random variables in Examples 3.1—3.4 are discrete. As another example, suppose 
we select married couples at random and do a blood test on each person until we find 
a husband and wife who both have the same Rh factor. With X = the number of blood 
tests to be performed, possible values of X are D = {2, 4, 6, 8,... }. Since the possible 
values have been listed in sequence, X is a discrete rv.


To study basic properties of discrete rv’s, only the tools of discrete mathematics— 
summation and differences—are required. The study of continuous variables requires 
the continuous mathematics of the calculus—integrals and derivatives. 


EXERCISES Section 3.1 (1-10) 


1. A concrete beam may fail either by shear (S) or flexure (F). Suppose that three
failed beams are randomly selected and the type of failure is determined for each
one. Let X = the number of beams among the three selected that failed by shear.
List each outcome in the sample space along with the associated value of X.

2. Give three examples of Bernoulli rv’s (other than those in the text).

3. Using the experiment in Example 3.3, define two more random variables and list
the possible values of each.

4. Let X = the number of nonzero digits in a randomly selected 4-digit PIN that has
no restriction on the digits. What are the possible values of X? Give three possible
outcomes and their associated X values.

5. If the sample space 𝒮 is an infinite set, does this necessarily imply that any
rv X defined from 𝒮 will have an infinite set of possible values? If yes, say why.
If no, give an example.

6. Starting at a fixed time, each car entering an intersection is observed to see
whether it turns left (L), right (R), or goes straight ahead (A). The experiment
terminates as soon as a car is observed to turn left. Let X = the number of cars
observed. What are possible X values? List five outcomes and their associated
X values.

7. For each random variable defined here, describe the set of possible values for
the variable, and state whether the variable is discrete.

a. X = the number of unbroken eggs in a randomly chosen standard egg carton

b. Y = the number of students on a class list for a particular course who are
absent on the first day of classes

c. U = the number of times a duffer has to swing at a golf ball before hitting it

d. X = the length of a randomly selected rattlesnake

e. Z = the sales tax percentage for a randomly selected amazon.com purchase

f. Y = the pH of a randomly chosen soil sample

g. X = the tension (psi) at which a randomly selected tennis racket has been strung

h. X = the total number of times three tennis players must spin their rackets to
obtain something other than UUU or DDD (to determine which two play next)

8. Each time a component is tested, the trial is a success (S) or failure (F).
Suppose the component is tested repeatedly until a success occurs on three
consecutive trials. Let Y denote the number of trials necessary to achieve this.
List all outcomes corresponding to the five smallest possible values of Y and
state which Y value is associated with each one.

9. An individual named Claudius is located at the point 0 in the accompanying
diagram. Using an appropriate randomization device (such as a tetrahedral die, one
having four sides), Claudius first moves to one of the four locations B_1, B_2,
B_3, B_4. Once at one of these locations, another randomization device is used to
decide whether Claudius next returns to 0 or next visits one of the other two
adjacent points. This process then continues; after each move, another move to one
of the (new) adjacent points is determined by tossing an appropriate die or coin.

a. Let X = the number of moves that Claudius makes before first returning to 0.
What are possible values of X? Is X discrete or continuous?

b. If moves are allowed also along the diagonal paths connecting 0 to A_1, A_2,
A_3, and A_4, respectively, answer the questions in part (a).

10. The number of pumps in use at both a six-pump station and a four-pump station
will be determined. Give the possible values for each of the following random
variables:

a. T = the total number of pumps in use

b. X = the difference between the numbers in use at stations 1 and 2

c. U = the maximum number of pumps in use at either station

d. Z = the number of stations having exactly two pumps in use

3.2 Probability Distributions 
for Discrete Random Variables 


Probabilities assigned to various outcomes in 𝒮 in turn determine probabilities
associated with the values of any particular rv X. The probability distribution of X says
how the total probability of 1 is distributed among (allocated to) the various possible 
X values. Suppose, for example, that a business has just purchased four laser printers, 
and let X be the number among these that require service during the warranty period. 
Possible X values are then 0, 1, 2, 3, and 4. The probability distribution will tell us 
how the probability of 1 is subdivided among these five possible values—how much 
probability is associated with the X value 0, how much is apportioned to the X value 1, 
and so on. We will use the following notation for the probabilities in the distribution: 


p(0) = the probability of the X value 0 = P(X = 0)
p(1) = the probability of the X value 1 = P(X = 1)


and so on. In general, p(x) will denote the probability assigned to the value x. 
EXAMPLE 3.7 The Cal Poly Department of Statistics has a lab with six computers reserved for
statistics majors. Let X denote the number of these computers that are in use at a 
particular time of day. Suppose that the probability distribution of X is as given in the 
following table; the first row of the table lists the possible X values and the second 
row gives the probability of each such value. 


x    | 0    1    2    3    4    5    6
p(x) | .05  .10  .15  .25  .20  .15  .10


We can now use elementary probability properties to calculate other probabilities of 
interest. For example, the probability that at most 2 computers are in use is 


P(X ≤ 2) = P(X = 0 or 1 or 2) = p(0) + p(1) + p(2) = .05 + .10 + .15 = .30


Since the event at least 3 computers are in use is complementary to at most 2 com- 
puters are in use, 


P(X ≥ 3) = 1 − P(X ≤ 2) = 1 − .30 = .70


which can, of course, also be obtained by adding together probabilities for the values 
3, 4, 5, and 6. The probability that between 2 and 5 computers inclusive are in use is 


P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5) = .15 + .25 + .20 + .15 = .75

whereas the probability that the number of computers in use is strictly between 2 and 5 is

P(2 < X < 5) = P(X = 3 or 4) = .25 + .20 = .45
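
As a rough illustration of these calculations, the pmf table of Example 3.7 can be stored as a Python dictionary and event probabilities obtained by adding p(x) over the x values in the event. The helper name `prob` is hypothetical, and the floating-point results agree with the hand calculations only up to rounding.

```python
# pmf of X = number of lab computers in use (values from the table in Example 3.7)
pmf = {0: .05, 1: .10, 2: .15, 3: .25, 4: .20, 5: .15, 6: .10}

def prob(event, pmf):
    """P(X is in the event): add p(x) over all x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

at_most_2 = prob(lambda x: x <= 2, pmf)              # P(X <= 2)
at_least_3 = 1 - at_most_2                           # complement rule
between_2_and_5 = prob(lambda x: 2 <= x <= 5, pmf)   # inclusive endpoints
strictly_between = prob(lambda x: 2 < x < 5, pmf)    # endpoints excluded

for name, value in [("P(X <= 2)", at_most_2), ("P(X >= 3)", at_least_3),
                    ("P(2 <= X <= 5)", between_2_and_5),
                    ("P(2 < X < 5)", strictly_between)]:
    print(f"{name} = {value:.2f}")
```

Note how the only difference between the last two probabilities is whether the endpoint values 2 and 5 are included in the sum.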


DEFINITION The probability distribution or probability mass function (pmf) of a discrete rv
is defined for every number x by p(x) = P(X = x) = P(all ω ∈ 𝒮: X(ω) = x).


In words, for every possible value x of the random variable, the pmf specifies 
the probability of observing that value when the experiment is performed. The
conditions p(x) ≥ 0 and Σ_{all possible x} p(x) = 1 are required of any pmf.
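
The two requirements on a pmf are easy to check mechanically. The following is a minimal sketch, assuming the pmf is stored as a dictionary from possible values to probabilities; the function name `is_valid_pmf` and the tolerance are illustrative choices, not part of the text.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Check p(x) >= 0 for every x and that the probabilities sum to 1
    (up to a small tolerance for floating-point rounding)."""
    nonnegative = all(p >= 0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return nonnegative and sums_to_one

# The table of Example 3.7 satisfies both conditions.
lab_pmf = {0: .05, 1: .10, 2: .15, 3: .25, 4: .20, 5: .15, 6: .10}

# A made-up table whose entries sum to only .90, so it is not a pmf.
not_a_pmf = {0: .50, 1: .40}

print(is_valid_pmf(lab_pmf), is_valid_pmf(not_a_pmf))
```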

The pmf of X in the previous example was simply given in the problem 
description. We now consider several examples in which various probability proper- 
ties are exploited to obtain the desired distribution. 


EXAMPLE 3.8 Six boxes of components are ready to be shipped by a certain supplier. The number 
of defective components in each box is as follows: 


Box                  | 1  2  3  4  5  6
Number of defectives | 0  2  0  1  2  0


One of these boxes is to be randomly selected for shipment to a particular customer. 
Let X be the number of defectives in the selected box. The three possible X val- 
ues are 0, 1, and 2. Of the six equally likely simple events, three result in X = 0, one 
in X = 1, and the other two in X = 2. Then 


p(0) = P(X = 0) = P(box 1 or 3 or 6 is sent) = 3/6 = .500
p(1) = P(X = 1) = P(box 4 is sent) = 1/6 = .167
p(2) = P(X = 2) = P(box 2 or 5 is sent) = 2/6 = .333
That is, a probability of .500 is distributed to the X value 0, a probability of .167 is placed
on the X value 1, and the remaining probability, .333, is associated with the X value 2.
The values of X along with their probabilities collectively specify the pmf. If this
experiment were repeated over and over again, in the long run X = 0 would occur one-half of
the time, X = 1 one-sixth of the time, and X = 2 one-third of the time.
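
When all simple events are equally likely, as in Example 3.8, the pmf can be obtained by counting how many outcomes map to each X value. The sketch below uses exact fractions; the dictionary name `defectives` is an illustrative choice.

```python
from collections import Counter
from fractions import Fraction

# box number -> number of defective components (from the table in Example 3.8)
defectives = {1: 0, 2: 2, 3: 0, 4: 1, 5: 2, 6: 0}

# Each box is equally likely to be selected, so p(x) is the fraction of
# boxes whose number of defectives equals x.
counts = Counter(defectives.values())
n_boxes = len(defectives)
pmf = {x: Fraction(count, n_boxes) for x, count in sorted(counts.items())}

for x, p in pmf.items():
    print(f"p({x}) = {p} = {float(p):.3f}")
```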


EXAMPLE 3.9 Consider whether the next person buying a computer at a certain electronics store 
buys a laptop or a desktop model. Let 


X = { 1   if the customer purchases a desktop computer
    { 0   if the customer purchases a laptop computer

If 20% of all purchasers during that week select a desktop, the pmf for X is

p(0) = P(X = 0) = P(next customer purchases a laptop model) = .8
p(1) = P(X = 1) = P(next customer purchases a desktop model) = .2
p(x) = P(X = x) = 0 for x ≠ 0 or 1

An equivalent description is

       { .8   if x = 0
p(x) = { .2   if x = 1
       { 0    if x ≠ 0 or 1


Figure 3.2 is a picture of this pmf, called a line graph. X is, of course, a Bernoulli rv 
and p(x) is a Bernoulli pmf. 


Figure 3.2 The line graph for the pmf in Example 3.9


EXAMPLE 3.10 In a group of five potential blood donors—a, b, c, d, and e—only a and b have type 
O-positive blood. Five blood samples, one from each individual, will be typed in ran- 
dom order until an O+ individual is identified. Let the rv Y = the number of typings 
necessary to identify an O+ individual. Then the pmf of Y is 


p(1) = P(Y = 1) = P(a or b typed first) = 2/5 = .4

p(2) = P(Y = 2) = P(c, d, or e first, and then a or b)
     = P(c, d, or e first) · P(a or b next | c, d, or e first) = (3/5)(2/4) = .3

p(3) = P(Y = 3) = P(c, d, or e first and second, and then a or b) = (3/5)(2/4)(2/3) = .2

p(4) = P(Y = 4) = P(c, d, and e all typed first) = (3/5)(2/4)(1/3) = .1

p(y) = 0 if y ≠ 1, 2, 3, 4
In tabular form, the pmf is


y    | 1   2   3   4
p(y) | .4  .3  .2  .1

where any y value not listed receives zero probability. Figure 3.3 shows a line graph
of the pmf.


Figure 3.3 The line graph for the pmf in Example 3.10
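
The pmf of Example 3.10 can also be checked by brute force: list all 5! = 120 equally likely typing orders and record, for each, the position of the first O+ donor. This is only a sketch of that check, with variable names chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

donors = "abcde"
o_positive = {"a", "b"}   # only a and b have type O+ blood

def typings_needed(order):
    """Position (1-based) of the first O+ donor in the typing order."""
    for position, donor in enumerate(order, start=1):
        if donor in o_positive:
            return position

orders = list(permutations(donors))   # all 120 orders are equally likely
counts = Counter(typings_needed(order) for order in orders)
pmf = {y: Fraction(count, len(orders)) for y, count in sorted(counts.items())}

for y, p in pmf.items():
    print(f"p({y}) = {p}")
```

The enumeration agrees with the multiplication-rule argument, and Y = 5 never occurs because at most three non-O+ donors can come first.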


The name “probability mass function” is suggested by a model used in physics 
for a system of “point masses.” In this model, masses are distributed at various loca- 
tions x along a one-dimensional axis. Our pmf describes how the total probability 
mass of 1 is distributed at various points along the axis of possible values of the
random variable (where and how much mass at each x). 

Another useful pictorial representation of a pmf, called a probability histogram, 
is similar to histograms discussed in Chapter 1. Above each y with p(y) > 0, construct a 
rectangle centered at y. The height of each rectangle is proportional to p(y), and the base 
width is the same for all rectangles. When possible values are equally spaced, the base 
width is frequently chosen as the distance between successive y values (though it could 
be smaller). Figure 3.4 shows two probability histograms. 


Figure 3.4 Probability histograms: (a) Example 3.9; (b) Example 3.10


It is often helpful to think of a pmf as specifying a mathematical model for a discrete 
population. 


EXAMPLE 3.11 Consider selecting a household in a certain region at random and let X = the number 
of individuals in the selected household. Suppose the pmf of X is as follows: 


x    | 1     2     3     4     5     6     7     8     9     10
p(x) | .140  .175  .220  .260  .155  .025  .015  .005  .004  .001


[this is very close to the household size distribution for rural Thailand given in the arti- 
cle “The Probability of Containment for Multitype Branching Process Models for 
Emerging Epidemics” (J. of Applied Probability, 2011: 173-188), which modeled 
influenza transmission].

Suppose this is based on 1 million households. One way to view this situation
is to think of the population as consisting of 1 million households, each with its own
X value; the proportion with each X value is given by p(x) in the above table. An alter- 
native viewpoint is to forget about the households and think of the population itself 
as consisting of X values: 14% of these values are 1, 17.5% are 2, and so on. The pmf
then describes the distribution of the possible population values 1, 2, ..., 10.


Once we have such a population model, we will use it to compute values 
of population characteristics (e.g., the mean μ) and make inferences about such
characteristics. 


A Parameter of a Probability Distribution 


The pmf of the Bernoulli rv X in Example 3.9 was p(0) = .8 and p(1) = .2 
because 20% of all purchasers selected a desktop computer. At another store, 
it may be the case that p(0) = .9 and p(1) = .1. More generally, the pmf of any 
Bernoulli rv can be expressed in the form p(1) = α and p(0) = 1 − α, where
0 < α < 1. Because the pmf depends on the particular value of α, we often
write p(x; α) rather than just p(x):


          { 1 − α   if x = 0
p(x; α) = { α       if x = 1        (3.1)
          { 0       otherwise


Then each choice of α in Expression (3.1) yields a different pmf.


DEFINITION Suppose p(x) depends on a quantity that can be assigned any one of a number
of possible values, with each different value determining a different probabil-
ity distribution. Such a quantity is called a parameter of the distribution. The
collection of all probability distributions for different values of the parameter
is called a family of probability distributions.


The quantity α in Expression (3.1) is a parameter. Each different number
α between 0 and 1 determines a different member of the Bernoulli family of
distributions. 


EXAMPLE 3.12 Starting at a fixed time, we observe the gender of each newborn child at a certain 
hospital until a boy (B) is born. Let p = P(B), assume that successive births are 
independent, and define the rv X by X = number of births observed. Then

p(1) = P(X = 1) = P(B) = p
p(2) = P(X = 2) = P(GB) = P(G) · P(B) = (1 − p)p


and

p(3) = P(X = 3) = P(GGB) = P(G) · P(G) · P(B) = (1 − p)^2 p
Continuing in this way, a general formula emerges: 


       { (1 − p)^(x−1) p   x = 1, 2, 3, ...
p(x) = {                                      (3.2)
       { 0                 otherwise


The parameter p can assume any value between 0 and 1. Expression (3.2) describes 
the family of geometric distributions. In the gender scenario, p = .51 might be 
appropriate, but if we were looking for the first child with Rh-positive blood, then it 
might be the case that p = .85.
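
To make the role of the parameter concrete, here is a small sketch of Expression (3.2) as a Python function in which p is an argument, so that each value of p gives a different member of the geometric family. The function name `geometric_pmf` is an illustrative choice.

```python
def geometric_pmf(x, p):
    """p(x; p) = (1 - p)**(x - 1) * p for x = 1, 2, 3, ...; 0 otherwise."""
    if float(x).is_integer() and x >= 1:
        return (1 - p) ** (int(x) - 1) * p
    return 0.0

# Two members of the family: p = .51 (first boy) and p = .85 (first Rh-positive child).
for p in (.51, .85):
    first_three = [round(geometric_pmf(x, p), 4) for x in (1, 2, 3)]
    print(f"p = {p}: p(1), p(2), p(3) = {first_three}")

# The probabilities over x = 1, 2, 3, ... add to 1; a long partial sum is already very close.
partial_sum = sum(geometric_pmf(x, .51) for x in range(1, 200))
```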
The Cumulative Distribution Function


For some fixed value x, we often wish to compute the probability that the observed value 
of X will be at most x. For example, let X be the number of beds occupied in
a hospital’s emergency room at a certain time of day; suppose the pmf of X is given by

x    | 0    1    2    3    4
p(x) | .20  .25  .30  .15  .10


Then the probability that at most two beds are occupied is 
P(X ≤ 2) = p(0) + p(1) + p(2) = .75


Furthermore, since X ≤ 2.7 if and only if X ≤ 2, we also have P(X ≤ 2.7) = .75, and
similarly P(X ≤ 2.999) = .75. Since 0 is the smallest possible X value, P(X ≤ −1.5) = 0,
P(X ≤ −10) = 0, and in fact for any negative number x, P(X ≤ x) = 0. And because 4
is the largest possible value of X, P(X ≤ 4) = 1, P(X ≤ 9.8) = 1, and so on.

Very importantly, 


P(X < 2) = p(0) + p(1) = .45 < .75 = P(X ≤ 2)


because the latter probability includes the probability mass at the x value 2 whereas 
the former probability does not. More generally, P(X < x) < P(X ≤ x) whenever x
is a possible value of X. Furthermore, P(X ≤ x) is a well-defined and computable
probability for any number x. 


DEFINITION The cumulative distribution function (cdf) F(x) of a discrete rv X
with pmf p(x) is defined for every number x by


F(x) = P(X ≤ x) = Σ_{y: y ≤ x} p(y)        (3.3)


For any number x, F(x) is the probability that the observed value of X will be 
at most x. 


EXAMPLE 3.13 A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB of mem-
ory. The accompanying table gives the distribution of Y = the amount of memory in 
a purchased drive: 


y    | 1    2    4    8    16
p(y) | .05  .10  .35  .40  .10


Let’s first determine F(y) for each of the five possible values of Y: 
F(1) = P(Y ≤ 1) = P(Y = 1) = p(1) = .05
F(2) = P(Y ≤ 2) = P(Y = 1 or 2) = p(1) + p(2) = .15
F(4) = P(Y ≤ 4) = P(Y = 1 or 2 or 4) = p(1) + p(2) + p(4) = .50
F(8) = P(Y ≤ 8) = p(1) + p(2) + p(4) + p(8) = .90
F(16) = P(Y ≤ 16) = 1


Now for any other number y, F(y) will equal the value of F at the closest possible 
value of Y to the left of y. For example, 


F(2.7) = P(Y ≤ 2.7) = P(Y ≤ 2) = F(2) = .15
F(7.999) = P(Y ≤ 7.999) = P(Y ≤ 4) = F(4) = .50
If y is less than 1, F(y) = 0 [e.g., F(.58) = 0], and if y is at least 16, F(y) = 1 [e.g.,
F(25) = 1]. The cdf is thus


       { 0     y < 1
       { .05   1 ≤ y < 2
F(y) = { .15   2 ≤ y < 4
       { .50   4 ≤ y < 8
       { .90   8 ≤ y < 16
       { 1     16 ≤ y


A graph of this cdf is shown in Figure 3.5. 


Figure 3.5 A graph of the cdf of Example 3.13
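
The rule that F(y) equals the value of F at the closest possible value at or to the left of y translates directly into a lookup. The following sketch, using the flash drive distribution of Example 3.13, is one possible way to code a discrete cdf; it relies on the standard-library functions `bisect_right` and `accumulate`, and the floating-point values match the hand calculations only up to rounding.

```python
from bisect import bisect_right
from itertools import accumulate

values = [1, 2, 4, 8, 16]             # possible values of Y, in increasing order
probs = [.05, .10, .35, .40, .10]     # p(y) for each possible value
cumulative = list(accumulate(probs))  # F at each possible value

def F(y):
    """cdf of Y: total probability of the possible values that are <= y."""
    k = bisect_right(values, y)       # number of possible values <= y
    return cumulative[k - 1] if k > 0 else 0.0

for y in (.58, 1, 2.7, 7.999, 8, 25):
    print(f"F({y}) = {F(y):.2f}")
```

Because the lookup depends only on how many possible values lie at or below y, the function is flat between possible values and jumps at each of them, which is the step-function behavior described above.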


For X a discrete rv, the graph of F(x) will have a jump at every possible 
value of X and will be flat between possible values. Such a graph is called a step 
function. 


EXAMPLE 3.14 (Example 3.12 continued) The pmf of X = the number of births up to and
including that of the first boy had the form


       { (1 − p)^(x−1) p   x = 1, 2, 3, ...
p(x) = {
       { 0                 otherwise

For any positive integer x,

F(x) = Σ_{y ≤ x} p(y) = Σ_{y=1}^{x} (1 − p)^(y−1) p = p Σ_{y=0}^{x−1} (1 − p)^y        (3.4)


To evaluate this sum, recall that the partial sum of a geometric series is

Σ_{y=0}^{k} a^y = [1 − a^(k+1)] / (1 − a)


Using this in Equation (3.4), with a = 1 − p and k = x − 1, gives


F(x) = p · [1 − (1 − p)^x] / [1 − (1 − p)] = 1 − (1 − p)^x        x a positive integer

Since F is constant in between positive integers,

       { 0                  x < 1
F(x) = {                                (3.5)
       { 1 − (1 − p)^[x]    x ≥ 1
where [x] is the largest integer ≤ x (e.g., [2.7] = 2). Thus if p = .51 as in the birth
example, then the probability of having to examine at most five births to see the first
boy is F(5) = 1 − (.49)^5 = 1 − .0282 = .9718, whereas F(10) = 1 − (.49)^10 ≈ .9992. This cdf
is graphed in Figure 3.6.


Figure 3.6 A graph of F(x) for Example 3.14
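
As a quick numerical check of the derivation, the closed form 1 − (1 − p)^[x] can be compared with a direct summation of the geometric pmf. This is a sketch only; the function names are illustrative, and `math.floor` plays the role of [x].

```python
import math

def geometric_cdf(x, p):
    """Closed form: F(x) = 0 for x < 1 and 1 - (1 - p)**[x] for x >= 1."""
    if x < 1:
        return 0.0
    return 1 - (1 - p) ** math.floor(x)

def geometric_cdf_by_summing(x, p):
    """Direct definition: add p(y) = (1 - p)**(y - 1) * p over y = 1, ..., [x]."""
    return sum((1 - p) ** (y - 1) * p for y in range(1, math.floor(x) + 1))

p = .51
for x in (0.5, 1, 2.7, 5, 10):
    closed, summed = geometric_cdf(x, p), geometric_cdf_by_summing(x, p)
    print(f"x = {x}: closed form {closed:.4f}, direct sum {summed:.4f}")
```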


In examples thus far, the cdf has been derived from the pmf. This process can 
be reversed to obtain the pmf from the cdf whenever the latter function is available. 
For example, consider again the rv of Example 3.7 (the number of computers being 
used in a lab); possible X values are 0, 1,..., 6. Then 

p(3) = P(X = 3)
     = [p(0) + p(1) + p(2) + p(3)] − [p(0) + p(1) + p(2)]
     = P(X ≤ 3) − P(X ≤ 2)
     = F(3) − F(2)

More generally, the probability that X falls in a specified interval is easily obtained
from the cdf. For example,

P(2 ≤ X ≤ 4) = p(2) + p(3) + p(4)
             = [p(0) + ... + p(4)] − [p(0) + p(1)]
             = P(X ≤ 4) − P(X ≤ 1)
             = F(4) − F(1)

Notice that P(2 ≤ X ≤ 4) ≠ F(4) − F(2). This is because the X value 2 is included
in 2 ≤ X ≤ 4, so we do not want to subtract out its probability. However,
P(2 < X ≤ 4) = F(4) − F(2) because X = 2 is not in the interval 2 < X ≤ 4.


PROPOSITION For any two numbers a and b with a ≤ b,

P(a ≤ X ≤ b) = F(b) − F(a−)


where “a−” represents the largest possible X value that is strictly less than a. In
particular, if the only possible values are integers and if a and b are integers, then


P(a ≤ X ≤ b) = P(X = a or a + 1 or ... or b)
             = F(b) − F(a − 1)

Taking a = b yields P(X = a) = F(a) − F(a − 1) in this case.
The reason for subtracting F(a−) rather than F(a) is that we want to include
P(X = a); F(b) − F(a) gives P(a < X ≤ b). This proposition will be used
extensively when computing binomial and Poisson probabilities in Sections 3.4 and 3.6.


EXAMPLE 3.15 Let X = the number of days of sick leave taken by a randomly selected employee of
a large company during a particular year. If the maximum number of allowable
sick days per year is 14, possible values of X are 0, 1, ..., 14. With F(0) = .58,
F(1) = .72, F(2) = .76, F(3) = .81, F(4) = .88, and F(5) = .94,


P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5) = F(5) − F(1) = .22


and


P(X = 3) = F(3) − F(2) = .05
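
For an integer-valued rv, the rule P(a ≤ X ≤ b) = F(b) − F(a − 1) is a one-line computation once the cdf values are tabulated. The sketch below uses the six cdf values given in Example 3.15; the helper name `prob_between` is hypothetical, and cdf values beyond F(5) are not given in the example, so they are simply unavailable here.

```python
# cdf values given in Example 3.15 (X = days of sick leave, smallest value 0)
F = {0: .58, 1: .72, 2: .76, 3: .81, 4: .88, 5: .94}

def prob_between(a, b, F):
    """P(a <= X <= b) = F(b) - F(a - 1) for integers a <= b.
    F(a - 1) is taken to be 0 when a - 1 is below the smallest possible value 0."""
    lower = F[a - 1] if a - 1 >= 0 else 0.0
    return F[b] - lower

p_2_to_5 = prob_between(2, 5, F)    # F(5) - F(1)
p_exactly_3 = prob_between(3, 3, F)  # F(3) - F(2), the case a = b
print(f"P(2 <= X <= 5) = {p_2_to_5:.2f}, P(X = 3) = {p_exactly_3:.2f}")
```

Subtracting F(a − 1) rather than F(a) is what keeps the probability of the value a itself in the result.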


EXERCISES Section 3.2 (11-28) 


11. 


12. 


13. 


Let X be the number of students who show up for a pro- 

fessor’s office hour on a particular day. Suppose that the 

pmf of X is p(O) = .20, pC) = .25, p(2) = .30, p(3) = 

.15, and p(4) = .10. 

a. Draw the corresponding probability histogram. 

b. What is the probability that at least two students 
show up? More than two students show up? 

c. What is the probability that between one and three 
students, inclusive, show up? 

d. What is the probability that the professor shows up? 


Airlines sometimes overbook flights. Suppose that for a 
plane with 50 seats, 55 passengers have tickets. Define the 
random variable Y as the number of ticketed passengers who 
actually show up for the flight. The probability mass func- 
tion of Y appears in the accompanying table. 


y | 45 46 47 48 49 50 51 52 53 54 55 


PQ) | 05 10 12 14 .25 17 .06 .05 .03 .02 01 


a. What is the probability that the flight will accommo- 
date all ticketed passengers who show up? 

b. What is the probability that not all ticketed passen- 
gers who show up can be accommodated? 

c. If you are the first person on the standby list (which 
means you will be the first one to get on the plane if 
there are any seats available after all ticketed passen- 
gers have been accommodated), what is the probabil- 
ity that you will be able to take the flight? What is 
this probability if you are the third person on the 
standby list? 


A mail-order computer business has six telephone lines. 
Let X denote the number of lines in use at a specified 
time. Suppose the pmf of X is as given in the accompa- 
nying table. 


14. 


15. 


x | 0 1 2 3 4 5 6 


P(x) | 10 AS 20 25 = =.20. 06.04 


Calculate the probability of each of the following events. 
{at most three lines are in use} 

{fewer than three lines are in use} 

{at least three lines are in use} 

{between two and five lines, inclusive, are in use} 


caoee 


{between two and four lines, inclusive, are not in use } 
f. {at least four lines are not in use} 


A contractor is required by a county planning department to 

submit one, two, three, four, or five forms (depending on the 

nature of the project) in applying for a building permit. Let 

Y = the number of forms required of the next applicant. 

The probability that y forms are required is known to be 

proportional to y—that is, p(y) = ky fory = 1,...,5. 

a. What is the value of k? [Hint: 3_, p(y) = 1] 

b. What is the probability that at most three forms are 
required? 

c. What is the probability that between two and four 
forms (inclusive) are required? 

d. Could p(y) = y’/50 for y = 1,..., 5 be the pmf of Y? 


Many manufacturers have quality control programs that 
include inspection of incoming materials for defects. 
Suppose a computer manufacturer receives circuit boards 
in batches of five. Two boards are selected from each 
batch for inspection. We can represent possible outcomes 
of the selection process by pairs. For example, the pair 
(1, 2) represents the selection of boards 1 and 2 for 
inspection. 
a. List the ten different possible outcomes. 
b. Suppose that boards | and 2 are the only defective 
boards in a batch. Two boards are to be chosen at 
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16. 


17. 


18. 


19. 


20. 


random. Define X to be the number of defective 
boards observed among those inspected. Find the 
probability distribution of X. 

c. Let F(x) denote the cdf of X. First determine F(0) = 
P(X =0), FC), and F(2); then obtain F(x) for all 
other x. 


16. Some parts of California are particularly earthquake-prone.
Suppose that in one metropolitan area, 25% of all
homeowners are insured against earthquake damage.
Four homeowners are to be selected at random; let X
denote the number among the four who have earthquake
insurance.

a. Find the probability distribution of X. [Hint: Let S
denote a homeowner who has insurance and F one
who does not. Then one possible outcome is SFSS,
with probability (.25)(.75)(.25)(.25) and associated X
value 3. There are 15 other outcomes.]

b. Draw the corresponding probability histogram.

c. What is the most likely value for X?

d. What is the probability that at least two of the four
selected have earthquake insurance?




17. A new battery’s voltage may be acceptable (A) or unacceptable
(U). A certain flashlight requires two batteries,
so batteries will be independently selected and tested until
two acceptable ones have been found. Suppose that 90%
of all batteries have acceptable voltages. Let Y denote the
number of batteries that must be tested.

a. What is p(2), that is, P(Y = 2)?

b. What is p(3)? [Hint: There are two different outcomes
that result in Y = 3.]

c. To have Y = 5, what must be true of the fifth battery
selected? List the four outcomes for which Y = 5 and
then determine p(5).

d. Use the pattern in your answers for parts (a) through (c) to
obtain a general formula for p(y).


18. Two fair six-sided dice are tossed independently. Let
M = the maximum of the two tosses (so M(1,5) = 5,
M(3,3) = 3, etc.).

a. What is the pmf of M? [Hint: First determine p(1),
then p(2), and so on.]

b. Determine the cdf of M and graph it.


19. A library subscribes to two different weekly news magazines,
each of which is supposed to arrive in Wednesday’s
mail. In actuality, each one may arrive on Wednesday,
Thursday, Friday, or Saturday. Suppose the two arrive
independently of one another, and for each one P(Wed.) = .3,
P(Thurs.) = .4, P(Fri.) = .2, and P(Sat.) = .1. Let
Y = the number of days beyond Wednesday that it takes
for both magazines to arrive (so possible Y values are 0, 1,
2, or 3). Compute the pmf of Y. [Hint: There are 16 possible
outcomes; Y(W,W) = 0, Y(F,Th) = 2, and so on.]


20. Three couples and two single individuals have been
invited to an investment seminar and have agreed to
attend. Suppose the probability that any particular
couple or individual arrives late is .4 (a couple will
travel together in the same vehicle, so either both people
will be on time or else both will arrive late). Assume that
different couples and individuals are on time or late
independently of one another. Let X = the number of
people who arrive late for the seminar.

a. Determine the probability mass function of X. [Hint:
label the three couples #1, #2, and #3 and the two
individuals #4 and #5.]

b. Obtain the cumulative distribution function of X, and
use it to calculate P(2 ≤ X ≤ 6).


21. Suppose that you read through this year’s issues of the
New York Times and record each number that appears in
a news article: the income of a CEO, the number of
cases of wine produced by a winery, the total charitable
contribution of a politician during the previous tax year,
the age of a celebrity, and so on. Now focus on the leading
digit of each number, which could be 1, 2,..., 8, or 9.
Your first thought might be that the leading digit X of a
randomly selected number would be equally likely to be
one of the nine possibilities (a discrete uniform distribution).
However, much empirical evidence as well as
some theoretical arguments suggest an alternative probability
distribution called Benford’s law:

$$p(x) = P(\text{1st digit is } x) = \log_{10}\left(\frac{x+1}{x}\right) \qquad x = 1, 2, \ldots, 9$$

a. Without computing individual probabilities from this
formula, show that it specifies a legitimate pmf.

b. Now compute the individual probabilities and compare
to the corresponding discrete uniform distribution.

c. Obtain the cdf of X.

d. Using the cdf, what is the probability that the leading
digit is at most 3? At least 5?

[Note: Benford’s law is the basis for some auditing procedures
used to detect fraud in financial reporting, for
example, by the Internal Revenue Service.]


22. Refer to Exercise 13, and calculate and graph the cdf
F(x). Then use it to calculate the probabilities of the
events given in parts (a) through (d) of that problem.


23. A branch of a certain bank in New York City has six
ATMs. Let X represent the number of machines in use at
a particular time of day. The cdf of X is as follows:

$$F(x) = \begin{cases}
0 & x < 0 \\
.06 & 0 \le x < 1 \\
.19 & 1 \le x < 2 \\
.39 & 2 \le x < 3 \\
.67 & 3 \le x < 4 \\
.92 & 4 \le x < 5 \\
.97 & 5 \le x < 6 \\
1 & 6 \le x
\end{cases}$$

Calculate the following probabilities directly from the
cdf:

a. p(2), that is, P(X = 2)

b. P(X > 3)

c. P(2 ≤ X ≤ 5)

d. P(2 < X < 5)

24. An insurance company offers its policyholders a number
of different premium payment options. For a randomly
selected policyholder, let X = the number of
months between successive payments. The cdf of X is as
follows:

$$F(x) = \begin{cases}
0 & x < 1 \\
.30 & 1 \le x < 3 \\
.40 & 3 \le x < 4 \\
.45 & 4 \le x < 6 \\
.60 & 6 \le x < 12 \\
1 & 12 \le x
\end{cases}$$

a. What is the pmf of X?

b. Using just the cdf, compute P(3 ≤ X ≤ 6) and
P(4 ≤ X).


25. In Example 3.12, let Y = the number of girls born before 
the experiment terminates. With p= P(B) and 
1 — p = P(G), what is the pmf of Y? [Hint: First list the 
possible values of Y, starting with the smallest, and pro- 
ceed until you see a general formula. ] 


26. Alvie Singer lives at 0 in the accompanying diagram and
has four friends who live at A, B, C, and D. One day
Alvie decides to go visiting, so he tosses a fair coin twice
to decide which of the four to visit. Once at a friend’s
house, he will either return home or else proceed to one
of the two adjacent houses (such as 0, A, or C when at B),
with each of the three possibilities having probability
1/3. In this way, Alvie continues to visit friends until he
returns home.

[Diagram: Alvie’s home 0 and the friends’ houses A, B, C, and D]

a. Let X = the number of times that Alvie visits a
friend. Derive the pmf of X.

b. Let Y = the number of straight-line segments that
Alvie traverses (including those leading to and from
0). What is the pmf of Y?

c. Suppose that female friends live at A and C and male
friends at B and D. If Z = the number of visits to
female friends, what is the pmf of Z?


27. After all students have left the classroom, a statistics
professor notices that four copies of the text were left
under desks. At the beginning of the next lecture, the
professor distributes the four books in a completely random
fashion to each of the four students (1, 2, 3, and 4)
who claim to have left books. One possible outcome is
that 1 receives 2’s book, 2 receives 4’s book, 3 receives
his or her own book, and 4 receives 1’s book. This outcome
can be abbreviated as (2, 4, 3, 1).

a. List the other 23 possible outcomes.

b. Let X denote the number of students who receive
their own book. Determine the pmf of X.


28. Show that the cdf F(x) is a nondecreasing function; that
is, x₁ < x₂ implies that F(x₁) ≤ F(x₂). Under what condition
will F(x₁) = F(x₂)?


3.3 Expected Values


Consider a university having 15,000 students and let X = the number of courses
for which a randomly selected student is registered. The pmf of X follows. Since
p(1) = .01, we know that (.01) · (15,000) = 150 of the students are registered for
one course, and similarly for the other x values.


x                    1     2     3     4     5     6     7
p(x)               .01   .03   .13   .25   .39   .17   .02      (3.6)
Number registered  150   450  1950  3750  5850  2550   300


The average number of courses per student, or the average value of X in the 
population, results from computing the total number of courses taken by all students 
and dividing by the total number of students. Since each of 150 students is taking 
one course, these 150 contribute 150 courses to the total. Similarly, 450 students 
contribute 2(450) courses, and so on. The population average value of X is then 


$$\frac{1(150) + 2(450) + 3(1950) + \cdots + 7(300)}{15{,}000} = 4.57 \qquad (3.7)$$

Since 150/15,000 = .01 = p(1), 450/15,000 = .03 = p(2), and so on, an alternative
expression for (3.7) is

$$1 \cdot p(1) + 2 \cdot p(2) + \cdots + 7 \cdot p(7) \qquad (3.8)$$


Expression (3.8) shows that to compute the population average value of X, 
we need only the possible values of X along with their probabilities (proportions). 
In particular, the population size is irrelevant as long as the pmf is given by (3.6). 
The average or mean value of X is then a weighted average of the possible values 
1,..., 7, where the weights are the probabilities of those values. 


The Expected Value of X 


DEFINITION Let X be a discrete rv with set of possible values D and pmf p(x). The expected
value or mean value of X, denoted by E(X) or μ_X or just μ, is

$$E(X) = \mu_X = \sum_{x \in D} x \cdot p(x)$$


EXAMPLE 3.16 For the pmf of X = number of courses in (3.6),

μ = 1 · p(1) + 2 · p(2) + ··· + 7 · p(7)
  = (1)(.01) + (2)(.03) + ··· + (7)(.02)
  = .01 + .06 + .39 + 1.00 + 1.95 + 1.02 + .14 = 4.57

If we think of the population as consisting of the X values 1, 2,..., 7, then μ = 4.57
is the population mean. In the sequel, we will often refer to μ as the population mean
rather than the mean of X in the population. Notice that μ here is not 4, the ordinary
average of 1,..., 7, because the distribution puts more weight on 4, 5, and 6 than on
other X values. ■


In Example 3.16, the expected value μ was 4.57, which is not a possible value
of X. The word expected should be interpreted with caution because one would not
expect to see an X value of 4.57 when a single student is selected.
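
The weighted-average computation of Example 3.16 can be sketched in a few lines of Python. The function name `expected_value` and the dictionary layout are illustrative choices, not notation from the text; p(2) = .03 and p(6) = .17 follow from the registration counts 450 and 2550 out of 15,000. Exact fractions are used only to sidestep floating-point round-off.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return E(X): the sum of x * p(x) over the possible values of X."""
    return sum(x * p for x, p in pmf.items())

# pmf (3.6): number of courses for which a student is registered
courses_pmf = {
    1: Fraction(1, 100), 2: Fraction(3, 100), 3: Fraction(13, 100),
    4: Fraction(25, 100), 5: Fraction(39, 100), 6: Fraction(17, 100),
    7: Fraction(2, 100),
}

assert sum(courses_pmf.values()) == 1    # a legitimate pmf sums to 1
mu = expected_value(courses_pmf)
print(float(mu))                          # 4.57
```

Because only the values and their probabilities enter the sum, the same function works for any population size, which is the point made after Expression (3.8).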


EXAMPLE 3.17 Just after birth, each newborn child is rated on a scale called the Apgar scale. The
possible ratings are 0, 1,..., 10, with the child’s rating determined by color, muscle
tone, respiratory effort, heartbeat, and reflex irritability (the best possible score
is 10). Let X be the Apgar score of a randomly selected child born at a certain hospital
during the next year, and suppose that the pmf of X is


x    |   0     1     2     3     4     5     6     7     8     9    10
p(x) | .002  .001  .002  .005  .02   .04   .18   .37   .25   .12   .01


Then the mean value of X is

E(X) = μ = 0(.002) + 1(.001) + 2(.002)
         + ··· + 8(.25) + 9(.12) + 10(.01)
         = 7.15

Again, μ is not a possible value of the variable X. Also, because the variable relates
to a future child, there is no concrete existing population to which μ refers. Instead,
we think of the pmf as a model for a conceptual population consisting of the values
0, 1, 2,..., 10. The mean value of this conceptual population is then μ = 7.15.


EXAMPLE 3.18 Let X = 1 if a randomly selected vehicle passes an emissions test and X = 0 otherwise.
Then X is a Bernoulli rv with pmf p(1) = p and p(0) = 1 - p, from which
E(X) = 0 · p(0) + 1 · p(1) = 0(1 - p) + 1(p) = p. That is, the expected value of
X is just the probability that X takes on the value 1. If we conceptualize a population
consisting of 0s in proportion 1 - p and 1s in proportion p, then the population
average is μ = p. ■


EXAMPLE 3.19 The general form for the pmf of X = the number of children born up to and including
the first boy is

$$p(x) = \begin{cases} p(1-p)^{x-1} & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$$

From the definition,

$$E(X) = \sum_{D} x \cdot p(x) = \sum_{x=1}^{\infty} x\,p(1-p)^{x-1} = p \sum_{x=1}^{\infty} \left[-\frac{d}{dp}(1-p)^{x}\right] \qquad (3.9)$$

Interchanging the order of taking the derivative and the summation, the sum is
that of a geometric series. After the sum is computed, the derivative is taken,
resulting in E(X) = 1/p. If p is near 1, we expect to see a boy very soon, whereas
if p is near 0, we expect many births before the first boy. For p = .5, E(X) = 2. ■
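
The geometric-series step described in Example 3.19 can be written out explicitly. This is a sketch that assumes 0 < p < 1, so that the series converges and may be differentiated term by term:

```latex
\begin{align*}
E(X) &= \sum_{x=1}^{\infty} x\,p(1-p)^{x-1}
      = p \sum_{x=1}^{\infty}\left[-\frac{d}{dp}(1-p)^{x}\right]
      = -p\,\frac{d}{dp}\sum_{x=1}^{\infty}(1-p)^{x} \\
     &= -p\,\frac{d}{dp}\left[\frac{1-p}{p}\right]
      = -p\left(-\frac{1}{p^{2}}\right)
      = \frac{1}{p}
\end{align*}
```

The geometric series used in the second line is $\sum_{x=1}^{\infty} r^{x} = r/(1-r)$ with $r = 1 - p$.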


There is another frequently used interpretation of μ. Consider observing a
first value x₁ of X, then a second value x₂, a third value x₃, and so on. After doing
this a large number of times, calculate the sample average of the observed xᵢ’s. This
average will typically be quite close to μ. That is, μ can be interpreted as the long-run
average observed value of X when the experiment is performed repeatedly. In
Example 3.17, the long-run average Apgar score is μ = 7.15.
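
The long-run interpretation can be illustrated with a small simulation. The sketch below draws Apgar scores from the pmf of Example 3.17 and compares the sample average with the theoretical mean; the seed, the sample size, and the variable names are arbitrary illustrative choices, and the probabilities .04 and .18 for scores 5 and 6 are inferred from the requirements that the pmf sum to 1 and have mean 7.15.

```python
import random

# Apgar pmf of Example 3.17 (p(5) = .04 and p(6) = .18 are inferred values)
values = list(range(11))
probs = [.002, .001, .002, .005, .02, .04, .18, .37, .25, .12, .01]

mu = sum(x * p for x, p in zip(values, probs))   # theoretical mean

rng = random.Random(2024)    # fixed seed so the run is reproducible
n = 100_000
sample = rng.choices(values, weights=probs, k=n)
sample_mean = sum(sample) / n

print(round(mu, 2), round(sample_mean, 2))
```

With a sample this large the two printed numbers should agree closely, though any single run is still subject to sampling variability.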


EXAMPLE 3.20 Let X, the number of interviews a student has prior to getting a job, have pmf

$$p(x) = \begin{cases} k/x^{2} & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$$

where k = 6/π² ensures that Σp(x) = 1 (the value of k comes from a result in Fourier
series). The expected value of X is

$$\mu = E(X) = \sum_{x=1}^{\infty} x \cdot \frac{k}{x^{2}} = k \sum_{x=1}^{\infty} \frac{1}{x} \qquad (3.10)$$

The sum on the right of Equation (3.10) is the famous harmonic series of
mathematics and can be shown to equal ∞. E(X) is not finite here because p(x) does
not decrease sufficiently fast as x increases; statisticians say that the probability
distribution of X has “a heavy tail.” If a sequence of X values is chosen using this
distribution, the sample average will not settle down to some finite number but will
tend to grow without bound.

Statisticians use the phrase “heavy tails” in connection with any distribution
having a large amount of probability far from μ (so heavy tails do not require μ = ∞).
Such heavy tails make it difficult to make inferences about μ. ■
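
A quick numerical sketch shows why the mean in Example 3.20 fails to exist: truncating the sum in (3.10) at larger and larger n never levels off. The code assumes k = 6/π², the constant that makes the probabilities k/x² sum to 1 (since the sum of 1/x² over x = 1, 2, ... is π²/6); the function name `partial_mean` is a hypothetical helper.

```python
import math

k = 6 / math.pi ** 2   # normalizing constant: sum of 1/x^2 over x >= 1 is pi^2 / 6

def partial_mean(n):
    """Sum of x * p(x) = k / x over x = 1, ..., n (a truncated version of (3.10))."""
    return sum(k / x for x in range(1, n + 1))

for n in (10, 1_000, 100_000):
    print(n, round(partial_mean(n), 3))
```

The partial sums grow roughly like k · ln(n), so they increase slowly but without bound.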
The Expected Value of a Function


Sometimes interest will focus on the expected value of some function h(X) rather 
than on just E(X). 


EXAMPLE 3.21 Suppose a bookstore purchases ten copies of a book at $6.00 each to sell at $12.00
with the understanding that at the end of a 3-month period any unsold copies can be
redeemed for $2.00. If X = the number of copies sold, then net revenue = h(X) =
12X + 2(10 - X) - 60 = 10X - 40. In this situation, we might be interested not
only in the expected number of copies sold [i.e., E(X)] but also in the expected net
revenue, that is, the expected value of a particular function of X. ■


An easy way of computing the expected value of h(X) is suggested by the fol- 
lowing example. 


EXAMPLE 3.22 The cost of a certain vehicle diagnostic test depends on the number of cylinders X in
the vehicle’s engine. Suppose the cost function is given by h(X) = 20 + 3X + .5X².
Since X is a random variable, so is Y = h(X). The pmf of X and derived pmf of Y are
as follows:

x    |  4    6    8              y    |  40   56   76
p(x) | .5   .3   .2      =>      p(y) |  .5   .3   .2

With D* denoting possible values of Y,

$$\begin{aligned}
E(Y) = E[h(X)] &= \sum_{D^*} y \cdot p(y) \\
&= (40)(.5) + (56)(.3) + (76)(.2) \\
&= h(4) \cdot (.5) + h(6) \cdot (.3) + h(8) \cdot (.2) \\
&= \sum_{D} h(x) \cdot p(x)
\end{aligned} \qquad (3.11)$$

According to Equation (3.11), it was not necessary to determine the pmf of Y to
obtain E(Y); instead, the desired expected value is a weighted average of the possible
h(x) (rather than x) values. ■


PROPOSITION If the rv X has a set of possible values D and pmf p(x), then the expected value
of any function h(X), denoted by E[h(X)] or μ_h(X), is computed by

$$E[h(X)] = \sum_{D} h(x) \cdot p(x)$$

That is, E[h(X)] is computed in the same way that E(X) itself is, except that
h(x) is substituted in place of x.
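
The two routes to E[h(X)] in Example 3.22 can be compared directly. In this sketch the dictionary names `pmf_x` and `pmf_y` are illustrative; the first route builds the pmf of Y = h(X) and averages over y, and the second applies the proposition without ever forming the pmf of Y.

```python
from fractions import Fraction

def h(x):
    """Cost function of Example 3.22: h(x) = 20 + 3x + .5x^2."""
    return 20 + 3 * x + Fraction(1, 2) * x ** 2

pmf_x = {4: Fraction(5, 10), 6: Fraction(3, 10), 8: Fraction(2, 10)}

# Route 1: derive the pmf of Y = h(X), then average the y values.
pmf_y = {}
for x, p in pmf_x.items():
    pmf_y[h(x)] = pmf_y.get(h(x), 0) + p
mean_via_y = sum(y * p for y, p in pmf_y.items())

# Route 2: the proposition, averaging h(x) against the pmf of X directly.
mean_via_x = sum(h(x) * p for x, p in pmf_x.items())

print(float(mean_via_y), float(mean_via_x))   # 52.0 52.0
```

Route 1 accumulates probabilities with `get` so that it would still be correct if two x values happened to map to the same y.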


EXAMPLE 3.23 A computer store has purchased three computers of a certain type at $500 apiece.
It will sell them for $1000 apiece. The manufacturer has agreed to repurchase
any computers still unsold after a specified period at $200 apiece. Let X
denote the number of computers sold, and suppose that p(0) = .1, p(1) = .2,
p(2) = .3, and p(3) = .4. With h(X) denoting the profit associated with
selling X units, the given information implies that h(X) = revenue - cost =
1000X + 200(3 - X) - 1500 = 800X - 900. The expected profit is then

E[h(X)] = h(0) · p(0) + h(1) · p(1) + h(2) · p(2) + h(3) · p(3)
        = (-900)(.1) + (-100)(.2) + (700)(.3) + (1500)(.4)
        = $700 ■


Expected Value of a Linear Function 


The h(X) function of interest is quite frequently a linear function aX + b. In this case,
E[h(X)] is easily computed from E(X) without the need for additional summation.


PROPOSITION E(aX + b) = a · E(X) + b


(Or, using alternative notation, μ_aX+b = a · μ_X + b)


To paraphrase, the expected value of a linear function equals the linear function
evaluated at the expected value E(X). Since h(X) in Example 3.23 is linear and
E(X) = 2, E[h(X)] = 800(2) - 900 = $700, as before.
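
As a check on the linear rule, the profit of Example 3.23 can be averaged both by direct summation and through a · E(X) + b. The variable names below are illustrative, and exact fractions are used so the two results can be compared with `==`.

```python
from fractions import Fraction

pmf = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}
a, b = 800, -900                      # profit h(X) = 800X - 900

e_x = sum(x * p for x, p in pmf.items())                 # E(X)
direct = sum((a * x + b) * p for x, p in pmf.items())    # E[h(X)] by summation
via_rule = a * e_x + b                                   # a * E(X) + b

print(e_x, direct, via_rule)   # 2 700 700
```

The shortcut applies only because h is linear; for a nonlinear h, such as the quadratic cost function of Example 3.22, E[h(X)] generally differs from h(E(X)).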


Proof

$$E(aX + b) = \sum_{D} (ax + b) \cdot p(x) = a \sum_{D} x \cdot p(x) + b \sum_{D} p(x) = aE(X) + b \qquad \blacksquare$$


Two special cases of the proposition yield two important rules of expected value. 


1. For any constant a, E(aX) = a · E(X) (take b = 0).
2. For any constant b, E(X + b) = E(X) + b (take a = 1).          (3.12)


Multiplication of X by a constant a typically changes the unit of measurement,
for example, from inches to cm, where a = 2.54. Rule 1 says that the expected value
in the new units equals the expected value in the old units multiplied by the conversion
factor a. Similarly, if a constant b is added to each possible value of X, then the
expected value will be shifted by that same constant amount.


The Variance of X 


The expected value of X describes where the probability distribution is centered. Using
the physical analogy of placing point mass p(x) at the value x on a one-dimensional
axis, if the axis were then supported by a fulcrum placed at μ, there would be no tendency
for the axis to tilt. This is illustrated for two different distributions in Figure 3.7.


Figure 3.7 Two different probability distributions with μ = 4


Although both distributions pictured in Figure 3.7 have the same center μ,
the distribution of Figure 3.7(b) has greater spread (i.e., variability or dispersion)
than does that of Figure 3.7(a). We will use the variance of X to assess the amount
of variability in (the distribution of) X, just as s² was used in Chapter 1 to measure
variability in a sample.


DEFINITION Let X have pmf p(x) and expected value μ. Then the variance of X, denoted
by V(X) or σ²_X, or just σ², is

$$V(X) = \sum_{D} (x - \mu)^{2} \cdot p(x) = E[(X - \mu)^{2}]$$

The standard deviation (SD) of X is

$$\sigma_X = \sqrt{\sigma_X^{2}}$$


The quantity h(X) = (X - μ)² is the squared deviation of X from its mean,
and σ² is the expected squared deviation, i.e., the weighted average of squared
deviations, where the weights are probabilities from the distribution. If most of the
probability distribution is close to μ, then σ² will be relatively small. However, if
there are x values far from μ that have large p(x), then σ² will be quite large. Very
roughly, σ can be interpreted as the size of a representative deviation from the mean
value μ. So if σ = 10, then in a long sequence of observed X values, some will
deviate from μ by more than 10 while others will be closer to the mean than that; a
typical deviation from the mean will be something on the order of 10.


EXAMPLE 3.24 A library has an upper limit of 6 on the number of DVDs that can be checked out to
an individual at one time. Consider only those who currently have DVDs checked
out, and let X denote the number of DVDs checked out to a randomly selected individual.
The pmf of X is as follows:


x    |   1    2    3    4    5    6
p(x) | .30  .25  .15  .05  .10  .15


The expected value of X is easily seen to be μ = 2.85. The variance of X is then

$$V(X) = \sigma^{2} = \sum_{x=1}^{6} (x - 2.85)^{2} \cdot p(x) = (1 - 2.85)^{2}(.30) + (2 - 2.85)^{2}(.25) + \cdots + (6 - 2.85)^{2}(.15) = 3.2275$$

The standard deviation of X is σ = √3.2275 = 1.797. ■
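
The definition of variance translates directly into code. The sketch below recomputes the quantities of Example 3.24 with exact fractions; the variable names are illustrative. Note that the square root of 3.2275 rounds to 1.797 at three decimal places.

```python
from fractions import Fraction
import math

# pmf of Example 3.24, written in hundredths to keep the arithmetic exact
pmf = {x: Fraction(c, 100) for x, c in zip(range(1, 7), (30, 25, 15, 5, 10, 15))}

mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # definition of V(X)
sd = math.sqrt(var)

print(float(mu), float(var), round(sd, 3))   # 2.85 3.2275 1.797
```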


When the pmf p(x) specifies a mathematical model for the distribution of
population values, both σ² and σ measure the spread of values in the population; σ²
is the population variance, and σ is the population standard deviation.


A Shortcut Formula for σ²


The number of arithmetic operations necessary to compute σ² can be reduced by
using an alternative formula.




PROPOSITION

$$V(X) = \sigma^{2} = \left[\sum_{D} x^{2} \cdot p(x)\right] - \mu^{2} = E(X^{2}) - [E(X)]^{2}$$


In using this formula, E(X²) is computed first without any subtraction; then E(X) is
computed, squared, and subtracted (once) from E(X²).


EXAMPLE 3.25 (Example 3.24 continued) The pmf of the number X of DVDs checked out was given as
p(1) = .30, p(2) = .25, p(3) = .15, p(4) = .05, p(5) = .10, and p(6) = .15, from which μ = 2.85 and

$$E(X^{2}) = \sum_{x=1}^{6} x^{2} \cdot p(x) = (1^{2})(.30) + (2^{2})(.25) + \cdots + (6^{2})(.15) = 11.35$$

Thus σ² = 11.35 - (2.85)² = 3.2275 as obtained previously from the definition. ■
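
The agreement between the shortcut formula and the definition can be confirmed exactly for the DVD pmf. This is a minimal sketch with illustrative variable names; fractions are used so the two results can be compared with `==` rather than up to round-off.

```python
from fractions import Fraction

pmf = {x: Fraction(c, 100) for x, c in zip(range(1, 7), (30, 25, 15, 5, 10, 15))}

e_x = sum(x * p for x, p in pmf.items())          # E(X)
e_x2 = sum(x ** 2 * p for x, p in pmf.items())    # E(X^2), no subtraction yet
shortcut = e_x2 - e_x ** 2                        # E(X^2) - [E(X)]^2
definition = sum((x - e_x) ** 2 * p for x, p in pmf.items())

print(float(e_x2), float(shortcut))   # 11.35 3.2275
```

The shortcut subtracts once at the end, whereas the definition subtracts μ inside every term of the sum.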


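Proof of the Shortcut Formula Expand (x - μ)² in the definition of σ² to
obtain x² - 2μx + μ², and then carry Σ through to each of the three terms:

$$\begin{aligned}
\sigma^{2} &= \sum_{D} x^{2} \cdot p(x) - 2\mu \sum_{D} x \cdot p(x) + \mu^{2} \sum_{D} p(x) \\
&= E(X^{2}) - 2\mu \cdot \mu + \mu^{2} = E(X^{2}) - \mu^{2} \qquad \blacksquare
\end{aligned}$$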


Variance of a Linear Function 


The variance of h(X) is the expected value of the squared difference between h(X) 
and its expected value: 


$$V[h(X)] = \sigma^{2}_{h(X)} = \sum_{D} \{h(x) - E[h(X)]\}^{2} \cdot p(x) \qquad (3.13)$$


When h(X) = aX + b, a linear function,

h(x) - E[h(X)] = ax + b - (aμ + b) = a(x - μ)

Substituting this into (3.13) gives a simple relationship between V[h(X)] and V(X):


PROPOSITION V(aX + b) = σ²_aX+b = a² · σ²_X and σ_aX+b = |a| · σ_X

In particular,

σ_aX = |a| · σ_X,   σ_X+b = σ_X          (3.14)


The absolute value is necessary because a might be negative, yet a standard 
deviation cannot be. Usually multiplication by a corresponds to a change in the unit 
of measurement (e.g., kg to lb or dollars to euros). According to the first relation in 
(3.14), the sd in the new unit is the original sd multiplied by the conversion factor. 
The second relation says that adding or subtracting a constant does not impact vari- 
ability; it just rigidly shifts the distribution to the right or left. 
EXAMPLE 3.26 In the computer sales scenario of Example 3.23, E(X) = 2 and

E(X²) = (0)²(.1) + (1)²(.2) + (2)²(.3) + (3)²(.4) = 5

so V(X) = 5 - (2)² = 1. The profit function h(X) = 800X - 900 then has variance
(800)² · V(X) = (640,000)(1) = 640,000 and standard deviation 800. ■
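
The variance rule for a linear function can be checked against the general definition (3.13) using the numbers of Example 3.26. The helper names `mean` and `variance` are hypothetical; they simply apply the two summation formulas to an arbitrary function g.

```python
from fractions import Fraction

pmf = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}

def mean(g):
    """E[g(X)] for the pmf above."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(g):
    """V[g(X)] from (3.13): expected squared deviation of g(X) from its mean."""
    m = mean(g)
    return mean(lambda x: (g(x) - m) ** 2)

a, b = 800, -900
var_x = variance(lambda x: x)
var_profit = variance(lambda x: a * x + b)

print(var_x, var_profit, a ** 2 * var_x)   # 1 640000 640000
```

Changing b leaves `var_profit` unchanged, which is the second relation in (3.14) at work.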


EXERCISES Section 3.3 (29-45) 


29. The pmf of the amount of memory X (GB) in a purchased
flash drive was given in Example 3.13 as


x    |   1    2    4    8   16
p(x) | .05  .10  .35  .40  .10


Compute the following:

a. E(X)

b. V(X) directly from the definition

c. The standard deviation of X

d. V(X) using the shortcut formula


30. An individual who has automobile insurance from a
certain company is randomly selected. Let Y be the number
of moving violations for which the individual was
cited during the last 3 years. The pmf of Y is


y    |   0    1    2    3
p(y) | .60  .25  .10  .05


a. Compute E(Y).

b. Suppose an individual with Y violations incurs a
surcharge of $100Y². Calculate the expected amount
of the surcharge.


31. Refer to Exercise 12 and calculate V(Y) and σ_Y. Then
determine the probability that Y is within 1 standard
deviation of its mean value.


32. A certain brand of upright freezer is available in three
different rated capacities: 16 ft³, 18 ft³, and 20 ft³. Let
X = the rated capacity of a freezer of this brand sold at
a certain store. Suppose that X has pmf


x    |  16   18   20
p(x) |  .2   .5   .3


a. Compute E(X), E(X²), and V(X).

b. If the price of a freezer having capacity X is
70X - 650, what is the expected price paid by the
next customer to buy a freezer?

c. What is the variance of the price paid by the next
customer?

d. Suppose that although the rated capacity of a freezer
is X, the actual capacity is h(X) = X - .008X². What
is the expected actual capacity of the freezer purchased
by the next customer?


33. Let X be a Bernoulli rv with pmf as in Example 3.18.

a. Compute E(X²).

b. Show that V(X) = p(1 - p).

c. Compute E(X’’). 

34. Suppose that the number of plants of a particular type
found in a rectangular sampling region (called a quadrat
by ecologists) in a certain geographic area is an rv X with
pmf

$$p(x) = \begin{cases} c/x^{3} & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$$

Is E(X) finite? Justify your answer (this is another distribution
that statisticians would call heavy-tailed).


35. A small market orders copies of a certain magazine for
its magazine rack each week. Let X = demand for the
magazine, with pmf


x    |   1     2     3     4     5     6
p(x) | 1/15  2/15  3/15  4/15  3/15  2/15


Suppose the store owner actually pays $2.00 for each
copy of the magazine and the price to customers is $4.00.
If magazines left at the end of the week have no salvage
value, is it better to order three or four copies of the
magazine? [Hint: For both three and four copies ordered,
express net revenue as a function of demand X, and then
compute the expected revenue.]


36. Let X be the damage incurred (in $) in a certain type of
accident during a given year. Possible X values are 0,
1000, 5000, and 10000, with probabilities .8, .1, .08,
and .02, respectively. A particular company offers a
$500 deductible policy. If the company wishes its
expected profit to be $100, what premium amount
should it charge?


37. The n candidates for a job have been ranked 1, 2, 3,..., n.
Let X = the rank of a randomly selected candidate, so
that X has pmf

$$p(x) = \begin{cases} 1/n & x = 1, 2, 3, \ldots, n \\ 0 & \text{otherwise} \end{cases}$$

(this is called the discrete uniform distribution). Compute
E(X) and V(X) using the shortcut formula. [Hint: The
sum of the first n positive integers is n(n + 1)/2,
whereas the sum of their squares is n(n + 1)(2n + 1)/6.]


38. Possible values of X, the number of components in a
system submitted for repair that must be replaced, are 1,
2, 3, and 4 with corresponding probabilities .15, .35, .35,
and .15, respectively.

a. Calculate E(X) and then E(5 - X).

b. Would the repair facility be better off charging a
flat fee of $75 or else the amount $[150/(5 - X)]?
[Note: It is not generally true that E(c/Y) = c/E(Y).]


A chemical supply company currently has in stock 
100 Ib of a certain chemical, which it sells to custom- 
ers in 5-lb batches. Let X = the number of batches 
ordered by a randomly chosen customer, and suppose 
that X has pmf 


x    | 1  2  3  4
p(x) | .2


Compute E(X) and V(X). Then compute the expected 
number of pounds left after the next customer’s order is 
shipped and the variance of the number of pounds left. 
[Hint: The number of pounds left is a linear function of X.] 


40. a. Draw a line graph of the pmf of X in Exercise 35.
Then determine the pmf of -X and draw its line
graph. From these two pictures, what can you say
about V(X) and V(-X)?

b. Use the proposition involving V(aX + b) to establish
a general relationship between V(X) and V(-X).


41. Use the definition in Expression (3.13) to prove that
V(aX + b) = a² · σ²_X. [Hint: With h(X) = aX + b,
E[h(X)] = aμ + b where μ = E(X).]


42. Suppose E(X) = 5 and E[X(X - 1)] = 27.5. What is

a. E(X²)? [Hint: First verify that E[X(X - 1)] =
E(X²) - E(X)]

b. V(X)?

c. The general relationship among the quantities E(X),
E[X(X - 1)], and V(X)?


43. Write a general rule for E(X - c) where c is a constant.
What happens when c = μ, the expected value of X?


44. A result called Chebyshev’s inequality states that for
any probability distribution of an rv X and any number k
that is at least 1, P(|X - μ| ≥ kσ) ≤ 1/k². In words, the
probability that the value of X lies at least k standard
deviations from its mean is at most 1/k².

a. What is the value of the upper bound for k = 2?
k = 3? k = 4? k = 5? k = 10?

b. Compute μ and σ for the distribution of Exercise
13. Then evaluate P(|X - μ| ≥ kσ) for the values
of k given in part (a). What does this suggest about
the upper bound relative to the corresponding
probability?


c. Let X have possible values -1, 0, and 1, with probabilities
1/18, 8/9, and 1/18, respectively. What is P(|X - μ| ≥ 3σ),
and how does it compare to the corresponding bound?

d. Give a distribution for which P(|X - μ| ≥ 5σ) = .04.


45. If a ≤ X ≤ b, show that a ≤ E(X) ≤ b.


3.4 The Binomial Probability Distribution 


There are many experiments that conform either exactly or approximately to 
the following list of requirements: 


1. The experiment consists of a sequence of n smaller experiments called trials,
where n is fixed in advance of the experiment.


2. Each trial can result in one of the same two possible outcomes (dichotomous
trials), which we generically denote by success (S) and failure (F).
The assignment of the S and F labels to the two sides of the dichotomy is
arbitrary.


3. The trials are independent, so that the outcome on any particular trial does not
influence the outcome on any other trial.


4. The probability of success P(S) is constant from trial to trial; we denote this
probability by p.
DEFINITION An experiment for which Conditions 1-4 (a fixed number of dichotomous,
independent, homogeneous trials) are satisfied is called a binomial experiment.


EXAMPLE 3.27 Consider each of the next n vehicles undergoing an emissions test, and let S denote
a vehicle that passes the test and F denote one that fails to pass. Then this experiment
satisfies Conditions 1-4. Tossing a thumbtack n times, with S = point up and
F = point down, also results in a binomial experiment, as would the experiment in
which the gender (S for female and F for male) is determined for each of the next n
children born at a particular hospital. ■


Many experiments involve a sequence of independent trials for which there are 
more than two possible outcomes on any one trial. A binomial experiment can then 
be created by dividing the possible outcomes into two groups. 


EXAMPLE 3.28 The color of pea seeds is determined by a single genetic locus. If the two alleles 
at this locus are AA or Aa (the genotype), then the pea will be yellow (the phe- 
notype), and if the allele is aa, the pea will be green. Suppose we pair off 20 Aa 
seeds and cross the two seeds in each of the ten pairs to obtain ten new geno- 
types. Call each new genotype a success S if it is aa and a failure otherwise. Then 
with this identification of S and F, the experiment is binomial with n = 10 and
p = P(aa genotype). If each member of the pair is equally likely to contribute a or
A, then p = P(a) · P(a) = (.5)(.5) = .25. ■


EXAMPLE 3.29 The pool of prospective jurors for a certain case consists of 50 individuals, of whom 
35 are employed. Suppose that 6 of these individuals are randomly selected one by 
one to sit in the jury box for initial questioning by lawyers for the defense and the 
prosecution. Label the ith person selected (the ith trial) as a success S if he or she is 
employed and a failure F otherwise. Then 


$$P(S \text{ on first trial}) = \frac{35}{50} = .70$$

and

$$\begin{aligned}
P(S \text{ on second trial}) &= P(SS) + P(FS) \\
&= P(\text{second } S \mid \text{first } S)\,P(\text{first } S) + P(\text{second } S \mid \text{first } F)\,P(\text{first } F) \\
&= \frac{34}{49} \cdot \frac{35}{50} + \frac{35}{49} \cdot \frac{15}{50} = \frac{35}{50}\left(\frac{34}{49} + \frac{15}{49}\right) = \frac{35}{50} = .70
\end{aligned}$$

Similarly, it can be shown that P(S on ith trial) = .70 for i = 3, 4, 5, 6. However,
if the first five individuals selected are all S, then only 30 Ss remain for the sixth
selection. Thus,

P(S on sixth trial | SSSSS) = 30/45 = .667

whereas

P(S on sixth trial | FFFFF) = 35/45 = .778

The experiment is not binomial because the trials are not independent. In general,
if sampling is without replacement, the experiment will not yield independent trials.
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Now consider a large county that has 500,000 individuals in its jury pool, of 
whom 400,000 are employed. A sample of 10 individuals from the pool is chosen 
without replacement. Again the ith trial is regarded as a success S if the ith individual is 
employed. The important difference between this and the previous scenario is that the 
size of the population being sampled is very large relative to the sample size. In this case 


$$P(S \text{ on } 2 \mid S \text{ on } 1) = \frac{399{,}999}{499{,}999} = .80000$$

and

$$P(S \text{ on } 10 \mid S \text{ on first } 9) = \frac{399{,}991}{499{,}991} = .799996 \approx .8000$$

$$P(S \text{ on } 10 \mid F \text{ on first } 9) = \frac{400{,}000}{499{,}991} = .800014 \approx .8000$$


These calculations suggest that although the trials are not exactly independent, 
the conditional probabilities differ so slightly from one another that for practical 
purposes the trials can be regarded as independent with constant P(S) = .8. Thus, to 
a very good approximation, the experiment is binomial with n = 10 and p = .8. ■


We will use the following rule of thumb in deciding whether a “without- 
replacement” experiment can be treated as being binomial. 


RULE Consider sampling without replacement from a dichotomous population of 
size N. If the sample size (number of trials) n is at most 5% of the population 
size, the experiment can be analyzed as though it were a binomial experiment. 


By “analyzed,” we mean that probabilities based on the binomial experiment
assumptions will be quite close to the actual “without-replacement” probabilities, 
which are typically more difficult to calculate. In the first scenario of Example 3.29, 
n/N = 6/50 = .12 > .05, so the binomial experiment is not a good approximation, 
but in the second scenario, n/N = 10/500,000 << .05. 
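
The rule of thumb can be checked numerically for the two jury-pool scenarios. The sketch below is illustrative only: the function names are hypothetical, and the exact without-replacement probability is obtained from a counting argument (choose x of the M successes and n − x of the N − M failures, out of all ways of choosing n from N) rather than from any formula stated so far in this section.

```python
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p): probability of x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def without_replacement_pmf(x, n, M, N):
    """Exact P(X = x) when n individuals are drawn without replacement
    from a population of N that contains M successes (counting argument)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def largest_gap(n, M, N):
    """Largest absolute difference between exact and binomial probabilities."""
    p = M / N
    return max(
        abs(without_replacement_pmf(x, n, M, N) - binomial_pmf(x, n, p))
        for x in range(n + 1)
    )

gap_small_pool = largest_gap(6, 35, 50)              # n/N = .12, rule not satisfied
gap_large_pool = largest_gap(10, 400_000, 500_000)   # n/N = .00002, rule satisfied

print(f"50-person pool:      largest gap = {gap_small_pool:.4f}")
print(f"500,000-person pool: largest gap = {gap_large_pool:.2e}")
```

In the small pool the binomial probabilities are off in the second decimal place, whereas in the large pool the two sets of probabilities agree to several decimal places.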


The Binomial Random Variable and Distribution


In most binomial experiments, it is the total number of S’s, rather than knowledge of 
exactly which trials yielded S’s, that is of interest. 


DEFINITION The binomial random variable X associated with a binomial experiment 
consisting of n trials is defined as 


X = the number of S’s among the n trials


Suppose, for example, that n = 3. Then there are eight possible outcomes for the 
experiment: 


SSS SSF SFS SFF FSS FSF FFS FFF 


From the definition of X, X(SSF) = 2, X(SFF) = 1, and so on. Possible values for 
X in an n-trial experiment are x = 0, 1, 2, ..., n. We will often write X ~ Bin(n, p)
to indicate that X is a binomial rv based on n trials with success probability p. 
NOTATION Because the pmf of a binomial rv X depends on the two parameters n and p,
we denote the pmf by b(x; n, p).


Consider first the case n = 4 for which each outcome, its probability, and corre- 
sponding x value are displayed in Table 3.1. For example, 


$$\begin{aligned}
P(SSFS) &= P(S)\cdot P(S)\cdot P(F)\cdot P(S) && \text{(independent trials)} \\
&= p\cdot p\cdot(1-p)\cdot p && \text{(constant } P(S)\text{)} \\
&= p^3(1-p)
\end{aligned}$$


Table 3.1 Outcomes and Probabilities for a Binomial Experiment with Four Trials 


Outcome   x   Probability       Outcome   x   Probability
SSSS      4   p^4               FSSS      3   p^3(1 − p)
SSSF      3   p^3(1 − p)        FSSF      2   p^2(1 − p)^2
SSFS      3   p^3(1 − p)        FSFS      2   p^2(1 − p)^2
SSFF      2   p^2(1 − p)^2      FSFF      1   p(1 − p)^3
SFSS      3   p^3(1 − p)        FFSS      2   p^2(1 − p)^2
SFSF      2   p^2(1 − p)^2      FFSF      1   p(1 − p)^3
SFFS      2   p^2(1 − p)^2      FFFS      1   p(1 − p)^3
SFFF      1   p(1 − p)^3        FFFF      0   (1 − p)^4


In this special case, we wish b(x; 4, p) for x = 0, 1, 2, 3, and 4. For b(3; 4, p), 
let’s identify which of the 16 outcomes yield an x value of 3 and sum the probabili- 
ties associated with each such outcome: 


$$b(3; 4, p) = P(FSSS) + P(SFSS) + P(SSFS) + P(SSSF) = 4p^3(1-p)$$


There are four outcomes with X = 3 and each has probability $p^3(1-p)$ (the order
of S’s and F’s is not important, only the number of S’s), so


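$$b(3; 4, p) = \left\{\begin{array}{c}\text{number of outcomes}\\ \text{with } X = 3\end{array}\right\}\cdot\left\{\begin{array}{c}\text{probability of any particular}\\ \text{outcome with } X = 3\end{array}\right\}$$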


Similarly, $b(2; 4, p) = 6p^2(1-p)^2$, which is also the product of the number of
outcomes with X = 2 and the probability of any such outcome.
In general, 


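$$b(x; n, p) = \left\{\begin{array}{c}\text{number of sequences of}\\ \text{length } n \text{ consisting of } x\ S\text{’s}\end{array}\right\}\cdot\left\{\begin{array}{c}\text{probability of any}\\ \text{particular such sequence}\end{array}\right\}$$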


Since the ordering of S’s and F’s is not important, the second factor in the previous
equation is $p^x(1-p)^{n-x}$ (e.g., the first x trials resulting in S and the last n − x
resulting in F). The first factor is the number of ways of choosing x of the n trials to
be S’s—that is, the number of combinations of size x that can be constructed from n 
distinct objects (trials here). 
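
The counting claim can be checked by brute force for small n. The following sketch (the function name is hypothetical) lists every S/F sequence of length n, tallies the sequences by their number of S’s, and compares the tallies with the number of combinations.

```python
from itertools import product
from math import comb

def count_sequences_by_successes(n):
    """Return a list whose entry x is the number of length-n S/F sequences
    containing exactly x S's."""
    counts = [0] * (n + 1)
    for outcome in product("SF", repeat=n):   # all 2**n sequences
        counts[outcome.count("S")] += 1
    return counts

counts4 = count_sequences_by_successes(4)
print(counts4)                           # [1, 4, 6, 4, 1]
print([comb(4, x) for x in range(5)])    # [1, 4, 6, 4, 1]
```

For n = 4 the tallies reproduce the 4 outcomes with X = 3 and the 6 outcomes with X = 2 found in Table 3.1.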


THEOREM

$$b(x; n, p) = \begin{cases}\dbinom{n}{x}p^x(1-p)^{n-x} & x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise}\end{cases}$$

EXAMPLE 3.30 Each of six randomly selected cola drinkers is given a glass containing cola S and one
containing cola F. The glasses are identical in appearance except for a code on the
bottom to identify the cola. Suppose there is actually no tendency among cola drink- 
ers to prefer one cola to the other. Then p = P(a selected individual prefers S) = .5, 
so with X = the number among the six who prefer S, X ~ Bin(6,.5). 

Thus 


$$P(X = 3) = b(3; 6, .5) = \binom{6}{3}(.5)^3(.5)^3 = 20(.5)^6 = .313$$


The probability that at least three prefer S is 


$$P(3 \le X) = \sum_{x=3}^{6} b(x; 6, .5) = \sum_{x=3}^{6}\binom{6}{x}(.5)^x(.5)^{6-x} = .656$$

and the probability that at most one prefers S is

$$P(X \le 1) = \sum_{x=0}^{1} b(x; 6, .5) = .109$$ ■
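
The three probabilities in Example 3.30 can be reproduced directly from the binomial pmf. This is a minimal sketch using only the Python standard library; the function name b simply mirrors the notation b(x; n, p).

```python
from math import comb

def b(x, n, p):
    """Binomial pmf b(x; n, p); zero outside x = 0, 1, ..., n."""
    if 0 <= x <= n:
        return comb(n, x) * p**x * (1 - p)**(n - x)
    return 0.0

n, p = 6, 0.5
p_exactly_3 = b(3, n, p)
p_at_least_3 = sum(b(x, n, p) for x in range(3, n + 1))
p_at_most_1 = sum(b(x, n, p) for x in range(0, 2))

print(p_exactly_3, p_at_least_3, p_at_most_1)
```

The exact values are 20/64, 42/64, and 7/64, which round to the figures quoted in the example.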


Using Binomial Tables* 


Even for a relatively small value of n, the computation of binomial probabilities
can be tedious. Appendix Table A.1 tabulates the cdf F(x) = P(X ≤ x) for
n = 5, 10, 15, 20, 25 in combination with selected values of p corresponding to dif- 
ferent columns of the table. Various other probabilities can then be calculated using 
the proposition on cdf’s from Section 3.2. A table entry of 0 signifies only that the 
probability is 0 to three significant digits since all table entries are actually positive. 


NOTATION For X ~ Bin(n, p), the cdf will be denoted by 


$$B(x; n, p) = P(X \le x) = \sum_{y=0}^{x} b(y; n, p) \qquad x = 0, 1, \ldots, n$$


EXAMPLE 3.31 Suppose that 20% of all copies of a particular textbook fail a certain binding strength 
test. Let X denote the number among 15 randomly selected copies that fail the test. 
Then X has a binomial distribution with n = 15 and p = .2. 


1. The probability that at most 8 fail the test is 


$$P(X \le 8) = \sum_{y=0}^{8} b(y; 15, .2) = B(8; 15, .2)$$


which is the entry in the x = 8 row and the p = .2 column of the n = 15 binomial 
table. From Appendix Table A.1, the probability is B(8; 15, .2) = .999. 


2. The probability that exactly 8 fail is 
$$P(X = 8) = P(X \le 8) - P(X \le 7) = B(8; 15, .2) - B(7; 15, .2)$$


which is the difference between two consecutive entries in the p = .2 column. 
The result is .999 — .996 = .003. 


* Statistical software packages such as Minitab and R will provide the pmf or cdf almost instantaneously 
upon request for any value of p and n ranging from 2 up into the millions. There is also an R command 
for calculating the probability that X lies in some interval. 
3. The probability that at least 8 fail is

$$P(X \ge 8) = 1 - P(X \le 7) = 1 - B(7; 15, .2) = 1 - \left(\text{entry in } x = 7 \text{ row of } p = .2 \text{ column}\right) = 1 - .996 = .004$$


4. Finally, the probability that between 4 and 7, inclusive, fail is 


$$\begin{aligned}
P(4 \le X \le 7) &= P(X = 4, 5, 6, \text{ or } 7) = P(X \le 7) - P(X \le 3) \\
&= B(7; 15, .2) - B(3; 15, .2) = .996 - .648 = .348
\end{aligned}$$


Notice that this latter probability is the difference between entries in the x = 7 and 
x = 3 rows, not the x = 7 and x = 4 rows. ■
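
In place of Appendix Table A.1, the cdf can be computed by summing the pmf. The sketch below, with function names chosen to mirror b(x; n, p) and B(x; n, p), recomputes the four probabilities of Example 3.31. Because the table entries are rounded to three decimals, results obtained this way may differ from the table-based answers in the last digit.

```python
from math import comb

def b(x, n, p):
    """Binomial pmf b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def B(x, n, p):
    """Binomial cdf B(x; n, p) = P(X <= x), summing the pmf from 0 through x."""
    return sum(b(y, n, p) for y in range(0, x + 1))

n, p = 15, 0.2
at_most_8 = B(8, n, p)                        # P(X <= 8)
exactly_8 = B(8, n, p) - B(7, n, p)           # P(X = 8)
at_least_8 = 1 - B(7, n, p)                   # P(X >= 8)
between_4_and_7 = B(7, n, p) - B(3, n, p)     # P(4 <= X <= 7)

for label, value in [("at most 8", at_most_8), ("exactly 8", exactly_8),
                     ("at least 8", at_least_8), ("4 through 7", between_4_and_7)]:
    print(f"{label:12s} {value:.4f}")
```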


EXAMPLE 3.32 An electronics manufacturer claims that at most 10% of its power supply units need 
service during the warranty period. To investigate this claim, technicians at a testing 
laboratory purchase 20 units and subject each one to accelerated testing to simulate 
use during the warranty period. Let p denote the probability that a power supply unit 
needs repair during the period (the proportion of all such units that need repair). The 
laboratory technicians must decide whether the data resulting from the experiment 
supports the claim that p ≤ .10. Let X denote the number among the 20 sampled that
need repair, so X ~ Bin(20, p). Consider the decision rule: 


Reject the claim that p ≤ .10 in favor of the conclusion that p > .10 if x ≥ 5


(where x is the observed value of X), and consider the claim plausible if x ≤ 4.


The probability that the claim is rejected when p = .10 (an incorrect conclusion) is 
P(X ≥ 5 when p = .10) = 1 − B(4; 20, .1) = 1 − .957 = .043


The probability that the claim is not rejected when p = .20 (a different type of 
incorrect conclusion) is 


P(X ≤ 4 when p = .2) = B(4; 20, .2) = .630


The first probability is rather small, but the second is intolerably large. When 
p = .20, so that the manufacturer has grossly understated the percentage of units that 
need service, and the stated decision rule is used, 63% of all samples will result in 
the manufacturer’s claim being judged plausible! 

One might think that the probability of this second type of erroneous conclu- 
sion could be made smaller by changing the cutoff value 5 in the decision rule to 
something else. However, although replacing 5 by a smaller number would yield a 
probability smaller than .630, the other probability would then increase. The only 
way to make both “error probabilities” small is to base the decision rule on an 
experiment involving many more units. ■
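
The trade-off between the two error probabilities in Example 3.32 can be tabulated for several cutoff values. In this sketch the function name and the particular cutoffs tried are illustrative choices; the rule "reject if x ≥ cutoff" with n = 20, claimed p = .10, and actual p = .20 is taken from the example.

```python
from math import comb

def B(x, n, p):
    """Binomial cdf B(x; n, p) = P(X <= x)."""
    return sum(comb(n, y) * p**y * (1 - p)**(n - y) for y in range(x + 1))

def error_probabilities(cutoff, n=20, p_claimed=0.10, p_actual=0.20):
    """For the rule 'reject the claim if x >= cutoff', return
    (P(reject when p = p_claimed), P(do not reject when p = p_actual))."""
    reject_when_claim_true = 1 - B(cutoff - 1, n, p_claimed)
    accept_when_claim_false = B(cutoff - 1, n, p_actual)
    return reject_when_claim_true, accept_when_claim_false

for cutoff in (3, 4, 5, 6):
    wrongly_reject, wrongly_accept = error_probabilities(cutoff)
    print(f"cutoff {cutoff}: wrongly reject = {wrongly_reject:.3f}, "
          f"wrongly accept = {wrongly_accept:.3f}")
```

Lowering the cutoff shrinks the second probability only at the cost of inflating the first, which is the point made in the closing paragraph of the example.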


The Mean and Variance of X 


For n = 1, the binomial distribution becomes the Bernoulli distribution. From
Example 3.18, the mean value of a Bernoulli variable is μ = p, so the expected
number of S’s on any single trial is p. Since a binomial experiment consists of n trials,
intuition suggests that for X ~ Bin(n, p), E(X) = np, the product of the number of 
trials and the probability of success on a single trial. The expression for V(X) is not 
so intuitive. 


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 


PROPOSITION If X ~ Bin(n, p), then E(X) = np, V(X) = np(1 − p) = npq, and
$\sigma_X = \sqrt{npq}$ (where q = 1 − p).


Thus, calculating the mean and variance of a binomial rv does not necessitate evaluating
summations. The proof of the result for E(X) is sketched in Exercise 64.


EXAMPLE 3.33 If 75% of all purchases at a certain store are made with a credit card and X is the
number among ten randomly selected purchases made with a credit card, then
X ~ Bin(10, .75). Thus E(X) = np = (10)(.75) = 7.5, V(X) = npq = 10(.75)(.25) =
1.875, and $\sigma_X = \sqrt{1.875} = 1.37$. Again, even though X can take on only integer
values, E(X) need not be an integer. If we perform a large number of independent binomial
experiments, each with n = 10 trials and p = .75, then the average number of
S’s per experiment will be close to 7.5.
The probability that X is within 1 standard deviation of its mean value is

$$P(7.5 - 1.37 \le X \le 7.5 + 1.37) = P(6.13 \le X \le 8.87) = P(X = 7 \text{ or } 8) = .532$$ ■


EXERCISES Section 3.4 (46-67)


46. Compute the following binomial probabilities directly from the formula for b(x; n, p):
a. b(3; 8, .35)
b. b(5; 8, .6)
c. P(3 ≤ X ≤ 5) when n = 7 and p = .6
d. P(1 ≤ X) when n = 9 and p = .1

47. The article “Should You Report That Fender-Bender?” (Consumer Reports, Sept. 2013: 15) reported
that 7 in 10 auto accidents involve a single vehicle (the article recommended always reporting to
the insurance company an accident involving multiple vehicles). Suppose 15 accidents are randomly
selected. Use Appendix Table A.1 to answer each of the following questions.
a. What is the probability that at most 4 involve a single vehicle?
b. What is the probability that exactly 4 involve a single vehicle?
c. What is the probability that exactly 6 involve multiple vehicles?
d. What is the probability that between 2 and 4, inclusive, involve a single vehicle?
e. What is the probability that at least 2 involve a single vehicle?
f. What is the probability that exactly 4 involve a single vehicle and the other 11 involve
multiple vehicles?

48. NBC News reported on May 2, 2013, that 1 in 20 children in the United States have a food
allergy of some sort. Consider selecting a random sample of 25 children and let X be the number
in the sample who have a food allergy. Then X ~ Bin(25, .05).
a. Determine both P(X ≤ 3) and P(X < 3).
b. [illegible in source]
c. Determine P(1 ≤ X ≤ 3).
d. What are E(X) and $\sigma_X$?
e. In a sample of 50 children, what is the probability that none has a food allergy?

49. A company that produces fine crystal knows from experience that 10% of its goblets have
cosmetic flaws and must be classified as “seconds.”
a. Among six randomly selected goblets, how likely is it that only one is a second?
b. Among six randomly selected goblets, what is the probability that at least two are seconds?
c. If goblets are examined one by one, what is the probability that at most five must be
selected to find four that are not seconds?

50. A particular telephone number is used to receive both voice calls and fax messages. Suppose
that 25% of the incoming calls involve fax messages, and consider a sample of 25 incoming calls.
What is the probability that
a. At most 6 of the calls involve a fax message?

b. Exactly 6 of the calls involve a fax message? 

c. At least 6 of the calls involve a fax message? 

d. More than 6 of the calls involve a fax message? 

51. Refer to the previous exercise. 

a. What is the expected number of calls among the 25 
that involve a fax message? 

b. What is the standard deviation of the number among 
the 25 calls that involve a fax message? 

c. What is the probability that the number of calls among 
the 25 that involve a fax transmission exceeds the 
expected number by more than 2 standard deviations? 

52. Suppose that 30% of all students who have to buy a text 
for a particular course want a new copy (the successes!), 
whereas the other 70% want a used copy. Consider ran- 
domly selecting 25 purchasers. 

a. What are the mean value and standard deviation of 
the number who want a new copy of the book? 

b. What is the probability that the number who want 
new copies is more than two standard deviations 
away from the mean value? 

c. The bookstore has 15 new copies and 15 used 
copies in stock. If 25 people come in one by one 
to purchase this text, what is the probability that 
all 25 will get the type of book they want from 
current stock? [Hint: Let X = the number who 
want a new copy. For what values of X will all 25 
get what they want?] 

d. Suppose that new copies cost $100 and used copies 
cost $70. Assume the bookstore currently has 50 new 
copies and 50 used copies. What is the expected value 
of total revenue from the sale of the next 25 copies 
purchased? Be sure to indicate what rule of expected 
value you are using. [Hint: Let h(X) = the revenue 
when X of the 25 purchasers want new copies. Express 
this as a linear function.] 

53. Exercise 30 (Section 3.3) gave the pmf of Y, the number 
of traffic citations for a randomly selected individual 
insured by a particular company. What is the probability 
that among 15 randomly chosen such individuals 
a. At least 10 have no citations? 

b. Fewer than half have at least one citation? 

c. The number that have at least one citation is between
5 and 10, inclusive?* 

* “Between a and b, inclusive” is equivalent to (a ≤ X ≤ b).

54. A particular type of tennis racket comes in a midsize version and an oversize version.
Sixty percent of all customers at a certain store want the oversize version.
a. Among ten randomly selected customers who want this type of racket, what is the probability
that at least six want the oversize version?
b. Among ten randomly selected customers, what is the probability that the number who want the
oversize version is within 1 standard deviation of the mean value?
c. The store currently has seven rackets of each version. What is the probability that all of
the next ten customers who want this racket can get the version they want from current stock?

55. Twenty percent of all telephones of a certain type are submitted for service while under
warranty. Of these, 60% can be repaired, whereas the other 40% must be replaced with new units.
If a company purchases ten of these telephones, what is the probability that exactly two will
end up being replaced under warranty?

56. The College Board reports that 2% of the 2 million high school students who take the SAT
each year receive special accommodations because of documented disabilities (Los Angeles Times,
July 16, 2002). Consider a random sample of 25 students who have recently taken the test.
a. What is the probability that exactly 1 received a special accommodation?
b. What is the probability that at least 1 received a special accommodation?
c. What is the probability that at least 2 received a special accommodation?
d. What is the probability that the number among the 25 who received a special accommodation is
within 2 standard deviations of the number you would expect to be accommodated?
e. Suppose that a student who does not receive a special accommodation is allowed 3 hours for
the exam, whereas an accommodated student is allowed 4.5 hours. What would you expect the average
time allowed the 25 selected students to be?

57. A certain type of flashlight requires two type-D batteries, and the flashlight will work only
if both its batteries have acceptable voltages. Suppose that 90% of all batteries from a certain
supplier have acceptable voltages. Among ten randomly selected flashlights, what is the
probability that at least nine will work? What assumptions did you make in the course of
answering the question posed?

58. A very large batch of components has arrived at a
distributor. The batch can be characterized as accept- 
able only if the proportion of defective components is 
at most .10. The distributor decides to randomly 
select 10 components and to accept the batch only if 
the number of defective components in the sample is 
at most 2. 

a. What is the probability that the batch will be 
accepted when the actual proportion of defectives is 
.01? .05? .10? .20? .25?

b. Let p denote the actual proportion of defectives in the batch. A graph of P(batch is
accepted) as a function of p, with p on the horizontal axis and P(batch is accepted) on the
vertical axis, is called the operating characteristic curve for the acceptance sampling plan.
Use the results of part (a) to sketch this curve for 0 ≤ p ≤ 1.
c. Repeat parts (a) and (b) with “1” replacing “2” in the acceptance sampling plan.
d. Repeat parts (a) and (b) with “15” replacing “10” in the acceptance sampling plan.
e. Which of the three sampling plans, that of part (a), (c), or (d), appears most satisfactory,
and why?

59. An ordinance requiring that a smoke detector be installed in all previously constructed
houses has been in effect in a particular city for 1 year. The fire department is concerned that
many houses remain without detectors. Let p = the true proportion of such houses having
detectors, and suppose that a random sample of 25 homes is inspected. If the sample strongly
indicates that fewer than 80% of all houses have a detector, the fire department will campaign
for a mandatory inspection program. Because of the costliness of the program, the department
prefers not to call for such inspections unless sample evidence strongly argues for their
necessity. Let X denote the number of homes with detectors among the 25 sampled. Consider
rejecting the claim that p ≥ .8 if x ≤ 15.
a. What is the probability that the claim is rejected when the actual value of p is .8?
b. What is the probability of not rejecting the claim when p = .7? When p = .6?
c. How do the “error probabilities” of parts (a) and (b) change if the value 15 in the decision
rule is replaced by 14?

60. A toll bridge charges $1.00 for passenger cars and $2.50 for other vehicles. Suppose that
during daytime hours, 60% of all vehicles are passenger cars. If 25 vehicles cross the bridge
during a particular daytime period, what is the resulting expected toll revenue? [Hint: Let
X = the number of passenger cars; then the toll revenue h(X) is a linear function of X.]

61. A student who is trying to write a paper for a course has a choice of two topics, A and B.
If topic A is chosen, the student will order two books through interlibrary loan, whereas if
topic B is chosen, the student will order four books. The student believes that a good paper
necessitates receiving and using at least half the books ordered for either topic chosen. If the
probability that a book ordered through interlibrary loan actually arrives in time is .9 and
books arrive independently of one another, which topic should the student choose to maximize the
probability of writing a good paper? What if the arrival probability is only .5 instead of .9?

62. a. For fixed n, are there values of p (0 ≤ p ≤ 1) for which
V(X) = 0? Explain why this is so. 


b. For what value of p is V(X) maximized? [Hint: Either graph V(X) as a function of p or else
take a derivative.]

63. a. Show that b(x; n, 1 − p) = b(n − x; n, p).
b. Show that B(x; n, 1 − p) = 1 − B(n − x − 1; n, p). [Hint: At most x S’s is equivalent to at
least (n − x) F’s.]
c. What do parts (a) and (b) imply about the necessity of including values of p greater than .5
in Appendix Table A.1?

64. Show that E(X) = np when X is a binomial random variable. [Hint: First express E(X) as a sum
with lower limit x = 1. Then factor out np, let y = x − 1 so that the sum is from y = 0 to
y = n − 1, and show that the sum equals 1.]

65. Customers at a gas station pay with a credit card (A), debit card (B), or cash (C). Assume
that successive customers make independent choices, with P(A) = .5, P(B) = .2, and P(C) = .3.
a. Among the next 100 customers, what are the mean and variance of the number who pay with a
debit card? Explain your reasoning.
b. Answer part (a) for the number among the 100 who don’t pay with cash.

66. An airport limousine can accommodate up to four passengers on any one trip. The company will
accept a maximum of six reservations for a trip, and a passenger must have a reservation. From
previous records, 20% of all those making reservations do not appear for the trip. Answer the
following questions, assuming independence wherever appropriate.
a. If six reservations are made, what is the probability that at least one individual with a
reservation cannot be accommodated on the trip?
b. If six reservations are made, what is the expected number of available places when the
limousine departs?
c. Suppose the probability distribution of the number of reservations made is given in the
accompanying table.

Number of reservations | 3 4 5 6
Probability            | [values illegible in source]

Let X denote the number of passengers on a randomly selected trip. Obtain the probability mass
function of X.

67. Refer to Chebyshev’s inequality given in Exercise 44. Calculate P(|X − μ| ≥ kσ) for k = 2
and k = 3 when X ~ Bin(20, .5), and compare to the corresponding upper bound. Repeat for
X ~ Bin(20, .75).


3.5 Hypergeometric and Negative Binomial Distributions


The hypergeometric and negative binomial distributions are both related to the binomial 
distribution. The binomial distribution is the approximate probability model for sampling 
without replacement from a finite dichotomous (S–F) population provided the sample size
n is small relative to the population size N; the hypergeometric distribution is the exact 
probability model for the number of S’s in the sample. The binomial rv X is the number 
of S’s when the number n of trials is fixed, whereas the negative binomial distribution 
arises from fixing the number of S’s desired and letting the number of trials be random. 


The Hypergeometric Distribution 
The assumptions leading to the hypergeometric distribution are as follows: 


1. The population or set to be sampled consists of N individuals, objects, or 
elements (a finite population). 

2. Each individual can be characterized as a success (S) or a failure (F), and there 
are M successes in the population. 


3. A sample of n individuals is selected without replacement in such a way that 
each subset of size n is equally likely to be chosen. 


The random variable of interest is X = the number of S’s in the sample. The 
probability distribution of X depends on the parameters n, M, and N, so we wish to 
obtain P(X = x) = h(x; n, M, N). 


EXAMPLE 3.34 During a particular period a university’s information technology office received 20 
service orders for problems with printers, of which 8 were laser printers and 12 were 
inkjet models. A sample of 5 of these service orders is to be selected for inclusion 
in a customer satisfaction survey. Suppose that the 5 are selected in a completely 
random fashion, so that any particular subset of size 5 has the same chance of 
being selected as does any other subset. What then is the probability that exactly 
x (x = 0, 1, 2, 3, 4, or 5) of the selected service orders were for inkjet printers? 

Here, the population size is N = 20, the sample size is n = 5, and the number
of S’s (inkjet = S) and F’s in the population are M = 12 and N − M = 8,
respectively. Consider the value x = 2. Because all outcomes (each consisting of 5 
particular orders) are equally likely, 


$$P(X = 2) = h(2; 5, 12, 20) = \frac{\text{number of outcomes having } X = 2}{\text{number of possible outcomes}}$$

The number of possible outcomes in the experiment is the number of ways of
selecting 5 from the 20 service orders without regard to order, that is, $\binom{20}{5}$. To
count the number of outcomes having X = 2, note that there are $\binom{12}{2}$ ways of
selecting 2 of the inkjet orders, and for each such way there are $\binom{8}{3}$ ways of selecting
the 3 laser orders to fill out the sample. The product rule from Chapter 2 then gives
$\binom{12}{2}\binom{8}{3}$ as the number of outcomes with X = 2, so

$$h(2; 5, 12, 20) = \frac{\binom{12}{2}\binom{8}{3}}{\binom{20}{5}} = \frac{77}{323} = .238$$ ■

In general, if the sample size n is smaller than the number of successes in the
population (M), then the largest possible X value is n. However, if M < n (e.g., a sample
size of 25 and only 15 successes in the population), then X can be at most M. Similarly,
whenever the number of population failures (N − M) exceeds the sample size, the
smallest possible X value is 0 (since all sampled individuals might then be failures).
However, if N − M < n, the smallest possible X value is n − (N − M). Thus, the possible
values of X satisfy the restriction max(0, n − (N − M)) ≤ x ≤ min(n, M). An
argument parallel to that of the previous example gives the pmf of X.


PROPOSITION If X is the number of S’s in a completely random sample of size n drawn from 
a population consisting of M S’s and (N — M) F’s, then the probability distri- 
bution of X, called the hypergeometric distribution, is given by 


$$P(X = x) = h(x; n, M, N) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} \qquad (3.15)$$

for x an integer satisfying max(0, n − N + M) ≤ x ≤ min(n, M).


In Example 3.34, n = 5, M = 12, and N = 20, so h(x; 5, 12, 20) for x = 0, 1, 2, 
3, 4, 5 can be obtained by substituting these numbers into Equation (3.15). 
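
As an illustration of that substitution, the sketch below evaluates the hypergeometric pmf for the service-order example, returning 0 outside the range of possible x values. The function name h mirrors the notation h(x; n, M, N); everything else is standard-library Python.

```python
from math import comb

def h(x, n, M, N):
    """Hypergeometric pmf h(x; n, M, N): probability of x successes in a
    sample of size n drawn without replacement from a population of N
    individuals, M of which are successes."""
    lowest, highest = max(0, n - (N - M)), min(n, M)
    if lowest <= x <= highest:
        return comb(M, x) * comb(N - M, n - x) / comb(N, n)
    return 0.0

# Example 3.34: n = 5 service orders sampled, M = 12 inkjet, N = 20 in all.
pmf = {x: h(x, 5, 12, 20) for x in range(6)}
for x, prob in pmf.items():
    print(f"h({x}; 5, 12, 20) = {prob:.3f}")
```

The x = 2 entry equals 77/323, matching the value computed in the example, and the six probabilities sum to 1.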


EXAMPLE 3.35 Five individuals from an animal population thought to be near extinction in a certain 
region have been caught, tagged, and released to mix into the population. After they 
have had an opportunity to mix, a random sample of 10 of these animals is selected. 
Let X = the number of tagged animals in the second sample. Suppose there are actu- 
ally 25 animals of this type in the region. 

The parameter values are n = 10, M = 5 (5 tagged animals in the population), 


and N = 25, so the pmf of X is

$$h(x; 10, 5, 25) = \frac{\binom{5}{x}\binom{20}{10-x}}{\binom{25}{10}} \qquad x = 0, 1, 2, 3, 4, 5$$


The probability that exactly two of the animals in the second sample are tagged is 


$$P(X = 2) = h(2; 10, 5, 25) = \frac{\binom{5}{2}\binom{20}{8}}{\binom{25}{10}} = .385$$


The probability that at most two of the animals in the recapture sample are tagged is 


$$P(X \le 2) = P(X = 0, 1, \text{ or } 2) = \sum_{x=0}^{2} h(x; 10, 5, 25) = .057 + .257 + .385 = .699$$ ■

Various statistical software packages will easily generate hypergeometric
probabilities (tabulation is cumbersome because of the three parameters). 

As in the binomial case, there are simple expressions for E(X) and V(X) for 
hypergeometric rv’s. 


PROPOSITION The mean and variance of the hypergeometric rv X having pmf h(x; n, M, N) are 


$$E(X) = n\cdot\frac{M}{N} \qquad V(X) = \left(\frac{N-n}{N-1}\right)\cdot n\cdot\frac{M}{N}\cdot\left(1 - \frac{M}{N}\right)$$


The ratio M/N is the proportion of S’s in the population. Replacing M/N by p
in E(X) and V(X) gives

$$E(X) = np \qquad V(X) = \left(\frac{N-n}{N-1}\right)\cdot np(1-p) \qquad (3.16)$$


Expression (3.16) shows that the means of the binomial and hypergeometric rv’s are 
equal, whereas the variances of the two rv’s differ by the factor (N − n)/(N − 1),
often called the finite population correction factor. This factor is less than 1, so 
the hypergeometric variable has smaller variance than does the binomial rv. The 
correction factor can be written as (1 — n/N)/(1 — 1/N), which is approximately 1 
when n is small relative to N. 


EXAMPLE 3.36 (Example 3.35 continued) In the animal-tagging example, n = 10, M = 5, and
N = 25, so p = 5/25 = .2 and

$$E(X) = 10(.2) = 2 \qquad V(X) = \frac{15}{24}(10)(.2)(.8) = (.625)(1.6) = 1$$


If the sampling had been carried out with replacement, V(X) = 1.6. 

Suppose the population size N is not actually known, so the value x is observed 
and we wish to estimate N. It is reasonable to equate the observed sample proportion 
of S’s, x/n, with the population proportion, M/N, giving the estimate 


$$\hat{N} = \frac{M\cdot n}{x}$$

If M = 100, n = 40, and x = 16, then $\hat{N} = 250$. ■
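
The mean and variance formulas can be checked against the definitions by summing over the pmf for the animal-tagging numbers, and the population-size estimate is a one-line computation. This is a sketch; the variable and function names are illustrative.

```python
from math import comb

def h(x, n, M, N):
    """Hypergeometric pmf h(x; n, M, N)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

n, M, N = 10, 5, 25
support = range(max(0, n - (N - M)), min(n, M) + 1)

# Mean and variance from the definitions (sums over the pmf).
mean_by_sum = sum(x * h(x, n, M, N) for x in support)
var_by_sum = sum((x - mean_by_sum) ** 2 * h(x, n, M, N) for x in support)

# Mean and variance from the closed-form expressions.
p = M / N
mean_by_formula = n * p
var_by_formula = (N - n) / (N - 1) * n * p * (1 - p)

def estimate_population_size(M, n, x):
    """Estimate N by equating x/n with M/N."""
    return M * n / x

print(mean_by_sum, mean_by_formula)
print(var_by_sum, var_by_formula)
print(estimate_population_size(100, 40, 16))
```

Both routes give a mean of 2 and a variance of 1, compared with the with-replacement (binomial) variance of 1.6.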


Our general rule of thumb in Section 3.4 stated that if sampling was without 
replacement but n/N was at most .05, then the binomial distribution could be used 
to compute approximate probabilities involving the number of S’s in the sample. 
A more precise statement is as follows: Let the population size, N, and number of 
population S’s, M, get large with the ratio M/N approaching p. Then h(x; n, M, N) 
approaches b(x; n, p); so for n/N small, the two are approximately equal provided 
that p is not too near either 0 or 1. This is the rationale for the rule.


The Negative Binomial Distribution 


The negative binomial rv and distribution are based on an experiment satisfying the 
following conditions: 


1. The experiment consists of a sequence of independent trials. 


2. Each trial can result in either a success (S) or a failure (F). 
3. The probability of success is constant from trial to trial, so P(S on trial i) = p
for i = 1, 2, 3, ....

4. The experiment continues (trials are performed) until a total of r successes have 
been observed, where r is a specified positive integer. 


The random variable of interest is X = the number of failures that precede the rth 
success; X is called a negative binomial random variable because, in contrast 
to the binomial rv, the number of successes is fixed and the number of trials is 
random. 

Possible values of X are 0, 1, 2, .... Let nb(x; r, p) denote the pmf of X.
Consider nb(7; 3, p) = P(X = 7), the probability that exactly 7 F’s occur before the
3rd S. In order for this to happen, the 10th trial must be an S and there must be exactly
2 S’s among the first 9 trials. Thus


$$nb(7; 3, p) = \left\{\binom{9}{2}\cdot p^2(1-p)^7\right\}\cdot p = \binom{9}{2}\cdot p^3(1-p)^7$$


Generalizing this line of reasoning gives the following formula for the negative 
binomial pmf. 


PROPOSITION The pmf of the negative binomial rv X with parameters r = number of S’s and
p = P(S) is

$$nb(x; r, p) = \binom{x+r-1}{r-1}p^r(1-p)^x \qquad x = 0, 1, 2, \ldots$$


EXAMPLE 3.37 A pediatrician wishes to recruit 5 couples, each of whom is expecting their first 
child, to participate in a new natural childbirth regimen. Let p = P(a randomly 
selected couple agrees to participate). If p = .2, what is the probability that 15 cou- 
ples must be asked before 5 are found who agree to participate? That is, with 
S = {agrees to participate}, what is the probability that 10 F’s occur before the fifth 
S? Substituting r = 5, p = .2, and x = 10 into nb(x; r, p) gives


\[ nb(10; 5, .2) = \binom{14}{4} (.2)^5 (.8)^{10} = .034 \]


The probability that at most 10 F’s are observed (at most 15 couples are asked) is


\[ P(X \le 10) = \sum_{x=0}^{10} nb(x; 5, .2) = (.2)^5 \sum_{x=0}^{10} \binom{x + 4}{4} (.8)^x = .164 \]
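
The two probabilities in this example can be reproduced directly from the pmf formula. This is a small sketch; the function names nb_pmf and nb_cdf are illustrative choices, not notation from the text.

```python
from math import comb


def nb_pmf(x, r, p):
    """nb(x; r, p): probability of exactly x failures before the r-th success."""
    if x < 0:
        return 0.0
    return comb(x + r - 1, r - 1) * p**r * (1 - p) ** x


def nb_cdf(x, r, p):
    """P(X <= x), obtained by summing the pmf over 0, 1, ..., x."""
    return sum(nb_pmf(k, r, p) for k in range(x + 1))


if __name__ == "__main__":
    # Example 3.37: r = 5 couples needed, p = .2
    print(round(nb_pmf(10, 5, 0.2), 3))  # exactly 10 refusals before the 5th agreement
    print(round(nb_cdf(10, 5, 0.2), 3))  # at most 10 refusals
```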


In some sources, the negative binomial rv is taken to be the number of trials 
X + r rather than the number of failures.
In the special case r = 1, the pmf is


\[ nb(x; 1, p) = (1 - p)^x p, \quad x = 0, 1, 2, \ldots \tag{3.17} \]


In Example 3.12, we derived the pmf for the number of trials necessary to obtain the 
first S, and the pmf there is similar to Expression (3.17). Both X = number of F’s 
and Y = number of trials (= 1 + X) are referred to in the literature as geometric
random variables, and the pmf in Expression (3.17) is called the geometric
distribution.
The expected number of trials until the first S was shown in Example 3.19 to
be 1/p, so that the expected number of F’s until the first S is (1/p) - 1 = (1 - p)/p.
Intuitively, we would expect to see r · (1 - p)/p F’s before the rth S, and this is
indeed E(X). There is also a simple formula for V(X). 


PROPOSITION If X is a negative binomial rv with pmf nb(x; r, p), then


\[ E(X) = \frac{r(1 - p)}{p} \qquad V(X) = \frac{r(1 - p)}{p^2} \]


Finally, by expanding the binomial coefficient in front of p^r(1 - p)^x and doing some
cancellation, it can be seen that nb(x; r, p) is well defined even when r is not an
integer. This generalized negative binomial distribution has been found to fit observed
data quite well in a wide variety of applications. 
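
One way to evaluate the generalized pmf is to write the expanded coefficient (x + r - 1)(x + r - 2) ··· (r) / x! as Γ(x + r) / (x! Γ(r)), which makes sense for any real r > 0. The sketch below is an illustration under that formulation; the use of the gamma function, the function names, and the values r = 2.5 and p = .4 are assumptions, not the text's. It also checks the mean and variance formulas numerically with a truncated sum.

```python
from math import exp, lgamma, log


def nb_pmf_general(x, r, p):
    """Negative binomial pmf for real r > 0 and 0 < p < 1, computed on the log scale."""
    log_coef = lgamma(x + r) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + r * log(p) + x * log(1 - p))


def nb_moments(r, p, terms=2000):
    """Total probability, mean, and variance from the first `terms` pmf values."""
    probs = [nb_pmf_general(x, r, p) for x in range(terms)]
    total = sum(probs)
    mean = sum(x * q for x, q in enumerate(probs))
    var = sum((x - mean) ** 2 * q for x, q in enumerate(probs))
    return total, mean, var


if __name__ == "__main__":
    r, p = 2.5, 0.4  # non-integer r, chosen only for illustration
    total, mean, var = nb_moments(r, p)
    print(total, mean, r * (1 - p) / p)    # mean should match r(1 - p)/p
    print(var, r * (1 - p) / p**2)         # variance should match r(1 - p)/p^2
```

For integer r the function agrees with the binomial-coefficient form of nb(x; r, p).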


EXERCISES Section 3.5 (68-78) 


68. Eighteen individuals are scheduled to take a driving test at a particular DMV office on a
certain day, eight of whom will be taking the test for the first time. Suppose that six of
these individuals are randomly assigned to a particular examiner, and let X be the number
among the six who are taking the test for the first time.
a. What kind of a distribution does X have (name and values of all parameters)?
b. Compute P(X = 2), P(X ≤ 2), and P(X ≥ 2).
c. Calculate the mean value and standard deviation of X.


69. Each of 12 refrigerators of a certain type has been returned to a distributor because of an
audible, high-pitched, oscillating noise when the refrigerators are running. Suppose that 7 of
these refrigerators have a defective compressor and the other 5 have less serious problems. If
the refrigerators are examined in random order, let X be the number among the first 6 examined
that have a defective compressor.
a. Calculate P(X = 4) and P(X ≤ 4).
b. Determine the probability that X exceeds its mean value by more than 1 standard deviation.
c. Consider a large shipment of 400 refrigerators, of which 40 have defective compressors. If
X is the number among 15 randomly selected refrigerators that have defective compressors,
describe a less tedious way to calculate (at least approximately) P(X ≤ 5) than to use the
hypergeometric pmf.


70. An instructor who taught two sections of engineering statistics last term, the first with
20 students and the second with 30, decided to assign a term project. After all projects had
been turned in, the instructor randomly ordered them before grading. Consider the first 15
graded projects.
a. What is the probability that exactly 10 of these are from the second section?
b. What is the probability that at least 10 of these are from the second section?
c. What is the probability that at least 10 of these are from the same section?
d. What are the mean value and standard deviation of the number among these 15 that are from
the second section?
e. What are the mean value and standard deviation of the number of projects not among these
first 15 that are from the second section?


71. A geologist has collected 10 specimens of basaltic rock and 10 specimens of granite. The
geologist instructs a laboratory assistant to randomly select 15 of the specimens for analysis.
a. What is the pmf of the number of granite specimens selected for analysis?
b. What is the probability that all specimens of one of the two types of rock are selected for
analysis?
c. What is the probability that the number of granite specimens selected for analysis is
within 1 standard deviation of its mean value?


72. A personnel director interviewing 11 senior engineers for four job openings has scheduled
six interviews for the first day and five for the second day of interviewing. Assume that the
candidates are interviewed in random order.
a. What is the probability that x of the top four candidates are interviewed on the first day?
b. How many of the top four candidates can be expected to be interviewed on the first day?


73. Twenty pairs of individuals playing in a bridge tournament have been seeded 1, ..., 20. In
the first part of the tournament, the 20 are randomly divided into 10 east-west pairs and 10
north-south pairs.
a. What is the probability that x of the top 10 pairs end up playing east-west?
b. What is the probability that all of the top five pairs end up playing the same direction?
c. If there are 2n pairs, what is the pmf of X = the number among the top n pairs who end up
playing east-west? What are E(X) and V(X)?


74. A second-stage smog alert has been called in a certain area of Los Angeles County in which
there are 50 industrial firms. An inspector will visit 10 randomly selected firms to check for
violations of regulations.
a. If 15 of the firms are actually violating at least one regulation, what is the pmf of the
number of firms visited by the inspector that are in violation of at least one regulation?
b. If there are 500 firms in the area, of which 150 are in violation, approximate the pmf of
part (a) by a simpler pmf.
c. For X = the number among the 10 visited that are in violation, compute E(X) and V(X) both
for the exact pmf and the approximating pmf in part (b).


75. The probability that a randomly selected box of a certain type of cereal has a particular
prize is .2. Suppose you purchase box after box until you have obtained two of these prizes.
a. What is the probability that you purchase x boxes that do not have the desired prize?
b. What is the probability that you purchase four boxes?
c. What is the probability that you purchase at most four boxes?
d. How many boxes without the desired prize do you expect to purchase? How many boxes do you
expect to purchase?


76. A family decides to have children until it has three children of the same gender. Assuming
P(B) = P(G) = .5, what is the pmf of X = the number of children in the family?


77. Three brothers and their wives decide to have children until each family has two female
children. What is the pmf of X = the total number of male children born to the brothers? What
is E(X), and how does it compare to the expected number of male children born to each brother?


78. According to the article “Characterizing the Severity and Risk of Drought in the Poudre
River, Colorado” (J. of Water Res. Planning and Mgmnt., 2005: 383-393), the drought length Y
is the number of consecutive time intervals in which the water supply remains below a critical
value y_0 (a deficit), preceded by and followed by periods in which the supply exceeds this
critical value (a surplus). The cited paper proposes a geometric distribution with p = .409
for this random variable.
a. What is the probability that a drought lasts exactly 3 intervals? At most 3 intervals?
b. What is the probability that the length of a drought exceeds its mean value by at least one
standard deviation?


3.6 The Poisson Probability Distribution


The binomial, hypergeometric, and negative binomial distributions were all derived 
by starting with an experiment consisting of trials or draws and applying the laws of 
probability to various outcomes of the experiment. There is no simple experiment on 
which the Poisson distribution is based, though we will shortly describe how it can 
be obtained by certain limiting operations. 


DEFINITION


A discrete random variable X is said to have a Poisson distribution with
parameter μ (μ > 0) if the pmf of X is


\[ p(x; \mu) = \frac{e^{-\mu} \mu^x}{x!}, \quad x = 0, 1, 2, 3, \ldots \]


It is no accident that we are using the symbol μ for the Poisson parameter; we shall
see shortly that μ is in fact the expected value of X. The letter e in the pmf represents
the base of the natural logarithm system; its numerical value is approximately 2.71828.
In contrast to the binomial and hypergeometric distributions, the Poisson distribution
spreads probability over all non-negative integers, an infinite number of possibilities.
It is not obvious by inspection that p(x; μ) specifies a legitimate pmf, let alone
that this distribution is useful. First of all, p(x; μ) > 0 for every possible x value
because of the requirement that μ > 0. The fact that Σ p(x; μ) = 1 is a consequence
of the Maclaurin series expansion of e^μ (check your calculus book for this result):


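\[ e^{\mu} = 1 + \mu + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \cdots = \sum_{x=0}^{\infty} \frac{\mu^x}{x!} \tag{3.18} \]


If the two extreme terms in (3.18) are multiplied by e^{-μ} and then this quantity is
moved inside the summation on the far right, the result is


\[ 1 = \sum_{x=0}^{\infty} \frac{e^{-\mu} \mu^x}{x!} \]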


Appendix Table A.2 contains the Poisson cdf F(x; μ) for μ = .1, .2, ..., 1, 2, ..., 10,
15, and 20. Alternatively, many software packages will provide F(x; μ) and p(x; μ)
upon request. 


EXAMPLE 3.38 Let X denote the number of traps (defects of a certain kind) in a particular type of
metal oxide semiconductor transistor, and suppose it has a Poisson distribution with
μ = 2 (the Poisson model is suggested in the article “Analysis of Random Telegraph
Noise in 45-nm CMOS Using On-Chip Characterization System” (IEEE Trans.
on Electron Devices, 2013: 1716-1722); we changed the value of the parameter for
computational ease).

The probability that there are exactly three traps is


\[ P(X = 3) = p(3; 2) = \frac{e^{-2} 2^3}{3!} = .180, \]


and the probability that there are at most three traps is


\[ P(X \le 3) = F(3; 2) = \sum_{x=0}^{3} \frac{e^{-2} 2^x}{x!} = .135 + .271 + .271 + .180 = .857 \]


This latter cumulative probability is found at the intersection of the μ = 2 column
and the x = 3 row of Appendix Table A.2, whereas p(3; 2) = F(3; 2) - F(2; 2) =
.857 - .677 = .180, the difference between two consecutive entries in the μ = 2
column of the cumulative Poisson table.
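
For readers without the appendix table at hand, the same values follow from a few lines of code. This is a minimal sketch; poisson_pmf and poisson_cdf are illustrative names, and the cdf is a plain sum of pmf values rather than a library routine.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """p(x; mu) = e^(-mu) * mu^x / x! for x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return exp(-mu) * mu**x / factorial(x)


def poisson_cdf(x, mu):
    """F(x; mu) = P(X <= x)."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))


if __name__ == "__main__":
    mu = 2.0  # Example 3.38
    print(round(poisson_pmf(3, mu), 3))                        # P(X = 3)
    print(round(poisson_cdf(3, mu), 3))                        # P(X <= 3)
    print(round(poisson_cdf(3, mu) - poisson_cdf(2, mu), 3))   # F(3; 2) - F(2; 2)
    print(sum(poisson_pmf(k, mu) for k in range(60)))          # close to 1
```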


The Poisson Distribution as a Limit 


The rationale for using the Poisson distribution in many situations is provided by the 
following proposition. 


PROPOSITION Suppose that in the binomial pmf b(x; n, p), we let n → ∞ and p → 0 in such
a way that np approaches a value μ > 0. Then b(x; n, p) → p(x; μ).


According to this result, in any binomial experiment in which n is large and p
is small, b(x; n, p) ≈ p(x; μ), where μ = np. As a rule of thumb, this approximation
can safely be applied if n > 50 and np < 5.


EXAMPLE 3.39 If a publisher of nontechnical books takes great pains to ensure that its books are free
of typographical errors, so that the probability of any given page containing at least one
such error is .005 and errors are independent from page to page, what is the probability
that one of its 600-page novels will contain exactly one page with errors? At most three
pages with errors?

With S denoting a page containing at least one error and F an error-free page,
the number X of pages containing at least one error is a binomial rv with n = 600
and p = .005, so np = 3. We wish


\[ P(X = 1) = b(1; 600, .005) \approx p(1; 3) = \frac{e^{-3} (3)^1}{1!} = .14936 \]


The binomial value is b(1; 600, .005) = .14899, so the approximation is very good.
Similarly,


\[ P(X \le 3) \approx \sum_{x=0}^{3} p(x; 3) = F(3; 3) = .647 \]


which to three-decimal-place accuracy is identical to B(3; 600, .005).
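
The comparison in this example is easy to repeat exactly, since the binomial probabilities can be computed without approximation. The sketch below uses illustrative function names and is not taken from the text.

```python
from math import comb, exp, factorial


def binom_pmf(x, n, p):
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """p(x; mu)."""
    return exp(-mu) * mu**x / factorial(x)


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for a binomial rv."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


def poisson_cdf(x, mu):
    """F(x; mu) = P(X <= x) for a Poisson rv."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))


if __name__ == "__main__":
    n, p = 600, 0.005  # Example 3.39
    mu = n * p
    print(round(binom_pmf(1, n, p), 5), round(poisson_pmf(1, mu), 5))
    print(round(binom_cdf(3, n, p), 3), round(poisson_cdf(3, mu), 3))
```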


Table 3.2 shows the Poisson distribution for μ = 3 along with three binomial
distributions with np = 3, and Figure 3.8 plots the Poisson along with the
first two binomial distributions. The approximation is of limited use for n = 30,
but of course the accuracy is better for n = 100 and much better for n = 300.


Table 3.2 Comparing the Poisson and Three Binomial Distributions


 x    n = 30, p = .1    n = 100, p = .03    n = 300, p = .01    Poisson, μ = 3
 0       0.042391           0.047553            0.049041           0.049787
 1       0.141304           0.147070            0.148609           0.149361
 2       0.227656           0.225153            0.224414           0.224042
 3       0.236088           0.227474            0.225170           0.224042
 4       0.177066           0.170606            0.168877           0.168031
 5       0.102305           0.101308            0.100985           0.100819
 6       0.047363           0.049610            0.050153           0.050409
 7       0.018043           0.020604            0.021277           0.021604
 8       0.005764           0.007408            0.007871           0.008102
 9       0.001565           0.002342            0.002580           0.002701
10       0.000365           0.000659            0.000758           0.000810
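
The entries of Table 3.2 can be regenerated as follows. This is a sketch with illustrative names; comparison_rows and its default arguments are assumptions made for the example, chosen to match the table's layout.

```python
from math import comb, exp, factorial


def binom_pmf(x, n, p):
    """b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """p(x; mu)."""
    return exp(-mu) * mu**x / factorial(x)


def comparison_rows(mu=3.0, ns=(30, 100, 300), max_x=10):
    """One row per x: (x, b(x; n, mu/n) for each n in ns, p(x; mu))."""
    rows = []
    for x in range(max_x + 1):
        binoms = [binom_pmf(x, n, mu / n) for n in ns]
        rows.append((x, *binoms, poisson_pmf(x, mu)))
    return rows


if __name__ == "__main__":
    print(" x    n=30      n=100     n=300     Poisson")
    for x, b30, b100, b300, pois in comparison_rows():
        print(f"{x:2d}  {b30:.6f}  {b100:.6f}  {b300:.6f}  {pois:.6f}")
```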

[Plot of p(x) against x. Legend: Bin, n = 30 (o); Bin, n = 100 (x); Poisson (|)]

Figure 3.8 Comparing a Poisson and two binomial distributions


The Mean and Variance of X


Since b(x; n, p) → p(x; μ) as n → ∞, p → 0, np → μ, the mean and variance of
a binomial variable should approach those of a Poisson variable. These limits are
np → μ and np(1 - p) → μ.


PROPOSITION If X has a Poisson distribution with parameter μ, then E(X) = V(X) = μ.


These results can also be derived directly from the definitions of mean and variance. 


EXAMPLE 3.40 (Example 3.38 continued) Both the expected number of traps and the variance of
the number of traps equal 2, and σ_X = √μ = √2 = 1.414.
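
A numerical check of E(X) = V(X) = μ, computed straight from the definitions of mean and variance, is sketched below. The function name, the truncation at 200 terms, and the use of the recurrence p(x + 1; μ) = p(x; μ) · μ/(x + 1) to avoid large factorials are choices made for this illustration only.

```python
from math import exp, sqrt


def poisson_moments(mu, terms=200):
    """Mean and variance of a Poisson rv from truncated sums over x = 0, ..., terms - 1."""
    p = exp(-mu)          # p(0; mu)
    mean = 0.0
    second_moment = 0.0
    for x in range(terms):
        mean += x * p
        second_moment += x * x * p
        p *= mu / (x + 1)  # step from p(x; mu) to p(x + 1; mu)
    return mean, second_moment - mean**2


if __name__ == "__main__":
    mean, var = poisson_moments(2.0)  # Example 3.40
    print(mean, var, round(sqrt(var), 3))
```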


The Poisson Process 


A very important application of the Poisson distribution arises in connection with 
the occurrence of events of some type over time. Events of interest might be visits 
to a particular Web site, pulses of some sort recorded by a counter, email messages 
sent to a particular address, accidents in an industrial facility, or cosmic ray showers 
observed by astronomers at a particular observatory. We make the following assump- 
tions about the way in which the events of interest occur: 


1. There exists a parameter α > 0 such that for any short time interval of length
Δt, the probability that exactly one event occurs is α · Δt + o(Δt).*


2. The probability of more than one event occurring during Δt is o(Δt) [which,
along with Assumption 1, implies that the probability of no events during Δt is
1 - α · Δt - o(Δt)].

3. The number of events occurring during the time interval Δt is independent of
the number that occur prior to this time interval.


Informally, Assumption 1 says that for a short interval of time, the probability of a
single event occurring is approximately proportional to the length of the time interval,
where α is the constant of proportionality. Now let P_k(t) denote the probability
that k events will be observed during any particular time interval of length t.


PROPOSITION P_k(t) = e^{-αt} · (αt)^k / k!, so that the number of events during a time interval of
length t is a Poisson rv with parameter μ = αt. The expected number of events
during any such time interval is then αt, so the expected number during a unit
interval of time is α.


The occurrence of events over time as described is called a Poisson process; the
parameter α specifies the rate for the process.
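
An informal way to see why the proposition is plausible (a heuristic sketch, not the derivation the text has in mind) is to cut the interval of length t into n short pieces of length t/n. By Assumptions 1-3, each piece independently contains one event with probability about αt/n and more than one with negligible probability, so the event count is approximately binomial with parameters n and αt/n. The code below, with illustrative names and values, shows this binomial probability approaching e^{-αt}(αt)^k / k! as n grows.

```python
from math import comb, exp, factorial


def binom_pmf(k, n, p):
    """b(k; n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """p(k; mu)."""
    return exp(-mu) * mu**k / factorial(k)


def discretized_pk(k, alpha, t, n):
    """Approximate P_k(t) by a binomial count over n subintervals of length t/n."""
    return binom_pmf(k, n, alpha * t / n)


if __name__ == "__main__":
    alpha, t, k = 6.0, 0.5, 2  # illustrative rate, interval length, and count
    for n in (10, 100, 1000, 10000):
        print(n, round(discretized_pk(k, alpha, t, n), 6))
    print("limit", round(poisson_pmf(k, alpha * t), 6))
```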


* A quantity is o(Δt) (read “little o of delta t”) if, as Δt approaches 0, so does o(Δt)/Δt. That is, o(Δt) is even
more negligible (approaches 0 faster) than Δt itself. The quantity (Δt)^2 has this property, but sin(Δt) does not.


EXAMPLE 3.41 Suppose pulses arrive at a counter at an average rate of six per minute, so that α = 6.
To find the probability that in a .5-min interval at least one pulse is received, note that
the number of pulses in such an interval has a Poisson distribution with parameter
αt = 6(.5) = 3 (.5 min is used because α is expressed as a rate per minute). Then with
X = the number of pulses received in the 30-sec interval,


\[ P(1 \le X) = 1 - P(X = 0) = 1 - \frac{e^{-3} (3)^0}{0!} = .950 \]
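
The same complement calculation works for any threshold. The helper below is a sketch; prob_at_least is a hypothetical name introduced for this illustration. It returns the probability of at least k events in an interval of length t for a Poisson process with rate α.

```python
from math import exp


def prob_at_least(k, alpha, t):
    """P(at least k events in an interval of length t), Poisson process with rate alpha."""
    mu = alpha * t
    term = exp(-mu)        # P(X = 0)
    below = 0.0            # accumulates P(X <= k - 1)
    for x in range(k):
        below += term
        term *= mu / (x + 1)
    return 1.0 - below


if __name__ == "__main__":
    # Example 3.41: six pulses per minute, half-minute interval
    print(round(prob_at_least(1, 6.0, 0.5), 3))
```

Keeping the rate and the interval length in the same time unit (here, minutes) is what makes αt the correct parameter.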


Instead of observing events over time, consider observing events of some type 
that occur in a two- or three-dimensional region. For example, we might select on a 
map a certain region R of a forest, go to that region, and count the number of trees. 
Each tree would represent an event occurring at a particular point in space. Under 
assumptions similar to 1-3, it can be shown that the number of events occurring
in a region R has a Poisson distribution with parameter α · a(R), where a(R) is the
area of R. The quantity α is the expected number of events per unit area or volume.


EXERCISES Section 3.6 (79-93) 


79. The article “Expectation Analysis of the Probability of Failure for Water Supply Pipes”
(J. of Pipeline Systems Engr. and Practice, May 2012: 36-46) proposed using the Poisson
distribution to model the number of failures in pipelines of various types. Suppose that for
cast-iron pipe of a particular length, the expected number of failures is 1 (very close to one
of the cases considered in the article). Then X, the number of failures, has a Poisson
distribution with μ = 1.
a. Obtain P(X ≤ 5) by using Appendix Table A.2.
b. Determine P(X = 2) first from the pmf formula and then from Appendix Table A.2.
c. Determine P(2 ≤ X ≤ 4).
d. What is the probability that X exceeds its mean value by more than one standard deviation?


80. Let X be the number of material anomalies occurring in a particular region of an aircraft
gas-turbine disk. The article “Methodology for Probabilistic Life Prediction of
Multiple-Anomaly Materials” (Amer. Inst. of Aeronautics and Astronautics J., 2006: 787-793)
proposes a Poisson distribution for X. Suppose that μ = 4.
a. Compute both P(X ≤ 4) and P(X < 4).
b. Compute P(4 ≤ X ≤ 8).
c. Compute P(8 ≤ X).
d. What is the probability that the number of anomalies exceeds its mean value by no more than
one standard deviation?


81. Suppose that the number of drivers who travel between a particular origin and destination
during a designated time period has a Poisson distribution with parameter μ = 20 (suggested in
the article “Dynamic Ride Sharing: Theory and Practice,” J. of Transp. Engr., 1997: 308-312).
What is the probability that the number of drivers will
a. Be at most 10?
b. Exceed 20?
c. Be between 10 and 20, inclusive? Be strictly between 10 and 20?
d. Be within 2 standard deviations of the mean value?


82. Consider writing onto a computer disk and then sending it through a certifier that counts
the number of missing pulses. Suppose this number X has a Poisson distribution with parameter
μ = .2. (Suggested in “Average Sample Number for Semi-Curtailed Sampling Using the Poisson
Distribution,” J. Quality Technology, 1983: 126-129.)
a. What is the probability that a disk has exactly one missing pulse?
b. What is the probability that a disk has at least two missing pulses?
c. If two disks are independently selected, what is the probability that neither contains a
missing pulse?


83. An article in the Los Angeles Times (Dec. 3, 1993) reports that 1 in 200 people carry the
defective gene that causes inherited colon cancer. In a sample of 1000 individuals, what is
the approximate distribution of the number who carry this gene? Use this distribution to
calculate the approximate probability that
a. Between 5 and 8 (inclusive) carry the gene.
b. At least 8 carry the gene.


84. The Centers for Disease Control and Prevention reported in 2012 that 1 in 88 American
children had been diagnosed with an autism spectrum disorder (ASD).
a. If a random sample of 200 American children is selected, what are the expected value and
standard deviation of the number who have been diagnosed with ASD?
b. Referring back to (a), calculate the approximate probability that at least 2 children in
the sample have been diagnosed with ASD.
c. If the sample size is 352, what is the approximate probability that fewer than 5 of the
selected children have been diagnosed with ASD?


85. Suppose small aircraft arrive at a certain airport according to a Poisson process with
rate α = 8 per hour, so that the number of arrivals during a time period of t hours is a
Poisson rv with parameter μ = 8t.
a. What is the probability that exactly 6 small aircraft arrive during a 1-hour period? At
least 6? At least 10?
b. What are the expected value and standard deviation of the number of small aircraft that
arrive during a 90-min period?
c. What is the probability that at least 20 small aircraft arrive during a 2.5-hour period?
That at most 10 arrive during this period?


86. Organisms are present in ballast water discharged from a ship according to a Poisson
process with a concentration of 10 organisms/m^3 [the article “Counting at Low
Concentrations: The Statistical Challenges of Verifying Ballast Water Discharge Standards”
(Ecological Applications, 2013: 339-351) considers using the Poisson process for this purpose].
a. What is the probability that one cubic meter of discharge contains at least 8 organisms?
b. What is the probability that the number of organisms in 1.5 m^3 of discharge exceeds its
mean value by more than one standard deviation?
c. For what amount of discharge would the probability of containing at least 1 organism be .999?


87. The number of requests for assistance received by a towing service is a Poisson process
with rate α = 4 per hour.
a. Compute the probability that exactly ten requests are received during a particular 2-hour
period.
b. If the operators of the towing service take a 30-min break for lunch, what is the
probability that they do not miss any calls for assistance?
c. How many calls would you expect during their break?


88. In proof testing of circuit boards, the probability that any particular diode will fail
is .01. Suppose a circuit board contains 200 diodes.
a. How many diodes would you expect to fail, and what is the standard deviation of the number
that are expected to fail?
b. What is the (approximate) probability that at least four diodes will fail on a randomly
selected board?
c. If five boards are shipped to a particular customer, how likely is it that at least four of
them will work properly? (A board works properly only if all its diodes work.)


89. The article “Reliability-Based Service-Life Assessment of Aging Concrete Structures”
(J. Structural Engr., 1993: 1600-1621) suggests that a Poisson process can be used to
represent the occurrence of structural loads over time. Suppose the mean time between
occurrences of loads is .5 year.
a. How many loads can be expected to occur during a 2-year period?
b. What is the probability that more than five loads occur during a 2-year period?
c. How long must a time period be so that the probability of no loads occurring during that
period is at most .1?


90. Let X have a Poisson distribution with parameter μ. Show that E(X) = μ directly from the
definition of expected value. [Hint: The first term in the sum equals 0, and then x can be
canceled. Now factor out μ and show that what is left sums to 1.]


91. Suppose that trees are distributed in a forest according to a two-dimensional Poisson
process with parameter α, the expected number of trees per acre, equal to 80.
a. What is the probability that in a certain quarter-acre plot, there will be at most 16 trees?
b. If the forest covers 85,000 acres, what is the expected number of trees in the forest?
c. Suppose you select a point in the forest and construct a circle of radius .1 mile. Let
X = the number of trees within that circular region. What is the pmf of X?
[Hint: 1 sq mile = 640 acres.]


92. Automobiles arrive at a vehicle equipment inspection station according to a Poisson
process with rate α = 10 per hour. Suppose that with probability .5 an arriving vehicle will
have no equipment violations.
a. What is the probability that exactly ten arrive during the hour and all ten have no
violations?
b. For any fixed y ≥ 10, what is the probability that y arrive during the hour, of which ten
have no violations?
c. What is the probability that ten “no-violation” cars arrive during the next hour?
[Hint: Sum the probabilities in part (b) from y = 10 to ∞.]


93. a. In a Poisson process, what has to happen in both the time interval (0, t) and the
interval (t, t + Δt) so that no events occur in the entire interval (0, t + Δt)? Use this and
Assumptions 1-3 to write a relationship between P_0(t + Δt) and P_0(t).
b. Use the result of part (a) to write an expression for the difference P_0(t + Δt) - P_0(t).
Then divide by Δt and let Δt → 0 to obtain an equation involving (d/dt)P_0(t), the derivative
of P_0(t) with respect to t.
c. Verify that P_0(t) = e^{-αt} satisfies the equation of part (b).
d. It can be shown in a manner similar to parts (a) and (b) that the P_k(t)’s must satisfy the
system of differential equations

\[ \frac{d}{dt} P_k(t) = \alpha P_{k-1}(t) - \alpha P_k(t), \quad k = 1, 2, 3, \ldots \]

Verify that P_k(t) = e^{-αt}(αt)^k / k! satisfies the system. (This is actually the only
solution.)


SUPPLEMENTARY EXERCISES (94-122)


94. 


95. 


96. 


97. 


98. 


Consider a deck consisting of seven cards, marked 1, 2,..., 
7. Three of these cards are selected at random. Define an 
tv W by W = the sum of the resulting numbers, and com- 
pute the pmf of W. Then compute pw and o. [Hint: 
Consider outcomes as unordered, so that (1, 3, 7) and (3, 
1, 7) are not different outcomes. Then there are 35 out- 
comes, and they can be listed. (This type of rv actually 
arises in connection with a statistical procedure called 
Wilcoxon’s rank-sum test, in which there is an x sample 
and a y sample and W is the sum of the ranks of the x’s in 
the combined sample; see Section 15.2.) 


After shuffling a deck of 52 cards, a dealer deals out 5. 
Let X = the number of suits represented in the five-card 
hand. 


a. Show that the pmf of X is 
x | 1 2 3 4 


146 588 .264 


P(x) 


[Hint: p(1) = 4P(all are spades), p(2) = 6P(only spades 
and hearts with at least one of each suit), and p(4) 
= 4P(2 spades MN one of each other suit). ] 

b. Compute p, 0”, and o. 


The negative binomial rv X was defined as the number of 
F’s preceding the rth S. Let Y = the number of trials 
necessary to obtain the rth S. In the same manner in 
which the pmf of X was derived, derive the pmf of Y. 


Of all customers purchasing automatic garage-door open- 
ers, 75% purchase a chain-driven model. Let X = the 
number among the next 15 purchasers who select the 
chain-driven model. 

a. What is the pmf of X? 

Compute P(X > 10). 

Compute P(6 = X = 10). 

Compute p and o?. 


cee 


If the store currently has in stock 10 chain-driven 
models and 8 shaft-driven models, what is the prob- 
ability that the requests of these 15 customers can all 
be met from existing stock? 


In some applications the distribution of a discrete rv X 
resembles the Poisson distribution except that zero is not 
a possible value of X. For example, let X¥ = the number 
of tattoos that an individual wants removed when she or 
he arrives at a tattoo-removal facility. Suppose the pmf 
of X is 


e x 


0 
p(x) =k 
x& 


¥= 1,2, 3555. 


a. Determine the value of k. Hint: The sum of all prob- 
abilities in the Poisson pmf is 1, and this pmf must 
also sum to 1. 


99. 


100. 


101. 


102. 


b. If the mean value of X is 2.313035, what is the prob- 
ability that an individual wants at most 5 tattoos 
removed? 

c. Determine the standard deviation of X when the 
mean value is as given in (b). 


[Note: The article “An Exploratory Investigation of 
Identity Negotiation and Tattoo Removal’ (Academy 
of Marketing Science Review, vol. 12, no. 6, 2008) gave 
a sample of 22 observations on the number of tattoos 
people wanted removed; estimates of and o calculated 
from the data were 2.318182 and 1.249242, respectively. | 


A k-out-of-n system is one that will function if and only 
if at least k of the n individual components in the system 
function. If individual components function indepen- 
dently of one another, each with probability .9, what is 
the probability that a 3-out-of-5 system functions? 


A manufacturer of integrated circuit chips wishes to con- 

trol the quality of its product by rejecting any batch in 

which the proportion of defective chips is too high. To 

this end, out of each batch (10,000 chips), 25 will be 

selected and tested. If at least 5 of these 25 are defective, 

the entire batch will be rejected. 

a. What is the probability that a batch will be rejected 
if 5% of the chips in the batch are in fact defective? 

b. Answer the question posed in (a) if the percentage of 
defective chips in the batch is 10%. 

c. Answer the question posed in (a) if the percentage of 
defective chips in the batch is 20%. 

d. What happens to the probabilities in (a)—(c) if the 
critical rejection number is increased from 5 to 6? 


Of the people passing through an airport metal detector, 
5% activate it; let X = the number among a randomly 
selected group of 500 who activate the detector. 

a. What is the (approximate) pmf of X? 

b. Compute P(X = 5). 

c. Compute P(5 = X). 


An educational consulting firm is trying to decide 
whether high school students who have never before 
used a hand-held calculator can solve a certain type of 
problem more easily with a calculator that uses reverse 
Polish logic or one that does not use this logic. A sam- 
ple of 25 students is selected and allowed to practice on 
both calculators. Then each student is asked to work one 
problem on the reverse Polish calculator and a similar 
problem on the other. Let p = P(S), where S indicates 
that a student worked the problem more quickly using 
reverse Polish logic than without, and let X = number 
of S’s. 

a. If p = .5, what is P(7 = X = 18)? 

b. If p = .8, what is P(7 = X = 18)? 
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c. If the claim that p = .5 is to be rejected when either 
x = 7 or x = 18, what is the probability of rejecting 
the claim when it is actually correct? 

d. If the decision to reject the claim p = .5 is made as 
in part (c), what is the probability that the claim is 
not rejected when p = .6? When p = .8? 

e. What decision rule would you choose for rejecting 
the claim p = .5 if you wanted the probability in part 
(c) to be at most .01? 


Consider a disease whose presence can be identified by 
carrying out a blood test. Let p denote the probability that 
a randomly selected individual has the disease. Suppose 
n individuals are independently selected for testing. One 
way to proceed is to carry out a separate test on each of 
the n blood samples. A potentially more economical 
approach, group testing, was introduced during World 
War II to identify syphilitic men among army inductees. 
First, take a part of each blood sample, combine these 
specimens, and carry out a single test. If no one has the 
disease, the result will be negative, and only the one test 
is required. If at least one individual is diseased, the test 
on the combined sample will yield a positive result, in 
which case the n individual tests are then carried out. If 
p =.1 and n = 3, what is the expected number of tests 
using this procedure? What is the expected number when 
n=5? [The article “Random Multiple-Access 
Communication and Group Testing” (IEEE Trans. on 
Commun., 1984: 769-774) applied these ideas to a com- 
munication system in which the dichotomy was active/ 
idle user rather than diseased/nondiseased. ] 


Let p_1 denote the probability that any particular code 
symbol is erroneously transmitted through a communication 
system. Assume that on different symbols, errors 
occur independently of one another. Suppose also that 
with probability p_2 an erroneous symbol is corrected 
upon receipt. Let X denote the number of correct symbols 
in a message block consisting of n symbols (after the 
correction process has ended). What is the probability 
distribution of X? 


The purchaser of a power-generating unit requires c con- 
secutive successful start-ups before the unit will be 
accepted. Assume that the outcomes of individual start- 
ups are independent of one another. Let p denote the 
probability that any particular start-up is successful. The 
random variable of interest is X = the number of start- 
ups that must be made prior to acceptance. Give the pmf 
of X for the case c = 2. If p = .9, what is P(X ≤ 8)? 
[Hint: For x ≥ 5, express p(x) “recursively” in terms of 
the pmf evaluated at the smaller values x - 3, x - 4, ..., 2.] 
(This problem was suggested by the article “Evaluation 
of a Start-Up Demonstration Test,” J. Quality 
Technology, 1983: 103-106.) 


A plan for an executive travelers’ club has been developed 
by an airline on the premise that 10% of its current 
customers would qualify for membership. 

a. Assuming the validity of this premise, among 25 
randomly selected current customers, what is the 
probability that between 2 and 6 (inclusive) qualify 
for membership? 

b. Again assuming the validity of the premise, what are 
the expected number of customers who qualify and 
the standard deviation of the number who qualify in 
a random sample of 100 current customers? 

c. Let X denote the number in a random sample of 25 
current customers who qualify for membership. 
Consider rejecting the company’s premise in favor of 
the claim that p > .10 if x ≥ 7. What is the probability 
that the company’s premise is rejected when it is 
actually valid? 

d. Refer to the decision rule introduced in part (c). 
What is the probability that the company’s premise is 
not rejected even though p = .20 (i.e., 20% qualify)? 


Forty percent of seeds from maize (modern-day corn) 
ears carry single spikelets, and the other 60% carry 
paired spikelets. A seed with single spikelets will pro- 
duce an ear with single spikelets 29% of the time, 
whereas a seed with paired spikelets will produce an ear 
with single spikelets 26% of the time. Consider randomly 
selecting ten seeds. 

a. What is the probability that exactly five of these 
seeds carry a single spikelet and produce an ear with 
a single spikelet? 

b. What is the probability that exactly five of the ears 
produced by these seeds have single spikelets? What 
is the probability that at most five ears have single 
spikelets? 


A trial has just resulted in a hung jury because eight 
members of the jury were in favor of a guilty verdict and 
the other four were for acquittal. If the jurors leave the 
jury room in random order and each of the first four 
leaving the room is accosted by a reporter in quest of an 
interview, what is the pmf of X = the number of jurors 
favoring acquittal among those interviewed? How many 
of those favoring acquittal do you expect to be inter- 
viewed? 


A reservation service employs five information operators 
who receive requests for information independently of 
one another, each according to a Poisson process with 
rate α = 2 per minute. 

a. What is the probability that during a given 1-min 
period, the first operator receives no requests? 

b. What is the probability that during a given 1-min 
period, exactly four of the five operators receive no 
requests? 

c. Write an expression for the probability that during a 
given 1-min period, all of the operators receive 
exactly the same number of requests. 


Grasshoppers are distributed at random in a large field 
according to a Poisson process with parameter α = 2 per 
square yard. How large should the radius R of a circular 
sampling region be taken so that the probability of finding 
at least one in the region equals .99? 


A newsstand has ordered five copies of a certain issue of 
a photography magazine. Let X = the number of individ- 
uals who come in to purchase this magazine. If X has a 
Poisson distribution with parameter μ = 4, what is the 
expected number of copies that are sold? 


Individuals A and B begin to play a sequence of chess 
games. Let S = {A wins a game}, and suppose that outcomes 
of successive games are independent with P(S) = p and 
P(F) = 1 - p (they never draw). They will play until one of 
them wins ten games. Let X = the number of games played 
(with possible values 10, 11, ..., 19). 

a. For x = 10, 11, ..., 19, obtain an expression for 
p(x) = P(X = x). 

b. If a draw is possible, with p = P(S), q = P(F), 
1 - p - q = P(draw), what are the possible values 
of X? What is P(20 ≤ X)? [Hint: P(20 ≤ X) = 
1 - P(X < 20).] 


A test for the presence of a certain disease has probability 
.20 of giving a false-positive reading (indicating that an 
individual has the disease when this is not the case) and 
probability .10 of giving a false-negative result. Suppose 
that ten individuals are tested, five of whom have the 
disease and five of whom do not. Let X = the number of 
positive readings that result. 

a. Does X have a binomial distribution? Explain your 
reasoning. 

b. What is the probability that exactly three of the ten 
test results are positive? 


The generalized negative binomial pmf is given by 


\[ nb(x; r, p) = k(r, x) \cdot p^{r} (1 - p)^{x}, \qquad x = 0, 1, 2, \ldots \]


Let X, the number of plants of a certain species found in 
a particular region, have this distribution with p = .3 and 
r = 2.5. What is P(X = 4)? What is the probability that 
at least one plant is found? 


There are two Certified Public Accountants in a particular 
office who prepare tax returns for clients. Suppose 
that for a particular type of complex form, the number of 
errors made by the first preparer has a Poisson distribution 
with mean value μ_1, the number of errors made by 
the second preparer has a Poisson distribution with mean 
value μ_2, and that each CPA prepares the same number of 
forms of this type. Then if a form of this type is randomly 
selected, the function 

\[ p(x; \mu_1, \mu_2) = .5\,\frac{e^{-\mu_1}\mu_1^{x}}{x!} + .5\,\frac{e^{-\mu_2}\mu_2^{x}}{x!}, \qquad x = 0, 1, 2, \ldots \]

gives the pmf of X = the number of errors on the selected 
form. 

a. Verify that p(x; μ_1, μ_2) is in fact a legitimate pmf 
(≥ 0 and sums to 1). 

b. What is the expected number of errors on the selected 
form? 

c. What is the variance of the number of errors on the 
selected form? 

d. How does the pmf change if the first CPA prepares 
60% of all such forms and the second prepares 40%? 


The mode of a discrete random variable X with pmf p(x) 
is that value x* for which p(x) is largest (the most probable 
x value). 

a. Let X ~ Bin(n, p). By considering the ratio 
b(x + 1; n, p)/b(x; n, p), show that b(x; n, p) increases 
with x as long as x < np - (1 - p). Conclude that 
the mode x* is the integer satisfying 
(n + 1)p - 1 ≤ x* ≤ (n + 1)p. 

b. Show that if X has a Poisson distribution with parameter 
μ, the mode is the largest integer less than μ. If 
μ is an integer, show that both μ - 1 and μ are 
modes. 


A computer disk storage device has ten concentric tracks, 
numbered 1, 2, ..., 10 from outermost to innermost, and a 
single access arm. Let p_i = the probability that any particular 
request for data will take the arm to track 
i (i = 1, ..., 10). Assume that the tracks accessed in successive 
seeks are independent. Let X = the number of tracks 
over which the access arm passes during two successive 
requests (excluding the track that the arm has just left, so 
possible X values are x = 0, 1, ..., 9). Compute the pmf 
of X. [Hint: P(the arm is now on track i and X = j) = 
P(X = j | arm now on i) · p_i. After the conditional 
probability is written in terms of p_1, ..., p_10, by the law of 
total probability, the desired probability is obtained by 
summing over i.] 


If X is a hypergeometric rv, show directly from the definition 
that E(X) = nM/N (consider only the case n < M). 
[Hint: Factor nM/N out of the sum for E(X), and show 
that the terms inside the sum are of the form 
h(y; n - 1, M - 1, N - 1), where y = x - 1.] 


Use the fact that 


\[ \sigma^{2} = \sum_{\text{all } x} (x - \mu)^{2} p(x) \ge \sum_{x:\, |x - \mu| \ge k\sigma} (x - \mu)^{2} p(x) \]

to prove Chebyshev’s inequality given in Exercise 44. 


The simple Poisson process of Section 3.6 is characterized 
by a constant rate α at which events occur per unit 
time. A generalization of this is to suppose that the probability 
of exactly one event occurring in the interval 
[t, t + Δt] is α(t) · Δt + o(Δt). It can then be shown that 
the number of events occurring during an interval [t_1, t_2] 
has a Poisson distribution with parameter 

\[ \mu = \int_{t_1}^{t_2} \alpha(t)\, dt \]

The occurrence of events over time in this situation is 
called a nonhomogeneous Poisson process. The article 
“Inference Based on Retrospective Ascertainment” 
(J. Amer. Stat. Assoc., 1989: 360-372) considers the 
intensity function 

\[ \alpha(t) = e^{a + bt} \]

as appropriate for events involving transmission of HIV 
(the AIDS virus) via blood transfusions. Suppose that 
a = 2 and b = .6 (close to values suggested in the paper), 
with time in years. 

a. What is the expected number of events in the interval 
[0, 4]? In [2, 6]? 

b. What is the probability that at most 15 events occur in 
the interval [0, .9907]? 


121. Consider a collection A_1, ..., A_k of mutually exclusive and 
exhaustive events, and a random variable X whose distribution 
depends on which of the A_i’s occurs (e.g., a commuter 
might select one of three possible routes from home 
to work, with X representing the commute time). Let 
E(X | A_i) denote the expected value of X given that the event 
A_i occurs. Then it can be shown that 
E(X) = Σ E(X | A_i) · P(A_i), the weighted average of the individual 
“conditional expectations” where the weights are the probabilities 
of the partitioning events. 

a. The expected duration of a voice call to a particular 
telephone number is 3 minutes, whereas the expected 
duration of a data call to that same number is 1 minute. 
If 75% of all calls are voice calls, what is the expected 
duration of the next call? 

b. A deli sells three different types of chocolate chip 
cookies. The number of chocolate chips in a type i 
cookie has a Poisson distribution with parameter 
μ_i = i + 1 (i = 1, 2, 3). If 20% of all customers 
purchasing a chocolate chip cookie select the first 
type, 50% choose the second type, and the remaining 
30% opt for the third type, what is the expected number 
of chips in a cookie purchased by the next customer? 


122. Consider a communication source that transmits packets 
containing digitized speech. After each transmission, the 
receiver sends a message indicating whether the transmis- 
sion was successful or unsuccessful. If a transmission is 
unsuccessful, the packet is re-sent. Suppose a voice 
packet can be transmitted a maximum of 10 times. 
Assuming that the results of successive transmissions are 
independent of one another and that the probability of any 
particular transmission being successful is p, determine 
the probability mass function of the rv X = the number of 
times a packet is transmitted. Then obtain an expression 
for the expected number of times a packet is transmitted. 




Continuous Random Variables and Probability Distributions 


INTRODUCTION 


Chapter 3 concentrated on the development of probability distributions for dis- 
crete random variables. In this chapter, we consider the second general type of 
random variable that arises in many applied problems. Sections 4.1 and 4.2 
present the basic definitions and properties of continuous random variables and 
their probability distributions. In Section 4.3, we study in detail the normal ran- 
dom variable and distribution, unquestionably the most important and useful in 
probability and statistics. Sections 4.4 and 4.5 discuss some other continuous 
distributions that are often used in applied work. In Section 4.6, we introduce a 
method for assessing whether given sample data is consistent with a specified 
distribution. 


4.1 Probability Density Functions 


A discrete random variable (rv) is one whose possible values either constitute a finite 
set or else can be listed in an infinite sequence (a list in which there is a first element, 
a second element, etc.). A random variable whose set of possible values is an entire 
interval of numbers is not discrete. 

Recall from Chapter 3 that a random variable X is continuous if (1) possible 
values comprise either a single interval on the number line (for some A < B, any 
number x between A and B is a possible value) or a union of disjoint intervals, and 
(2) P(X = c) = 0 for any number c that is a possible value of X. 


EXAMPLE 4.1 If in the study of the ecology of a lake, we make depth measurements at randomly 
chosen locations, then X = the depth at such a location is a continuous rv. Here A is 
the minimum depth in the region being sampled, and B is the maximum depth. ■ 


EXAMPLE 4.2 If a chemical compound is randomly selected and its pH X is determined, then X is 
a continuous rv because any pH value between 0 and 14 is possible. If more is known 
about the compound selected for analysis, then the set of possible values might be a 
subinterval of [0, 14], such as 5.5 ≤ x ≤ 6.5, but X would still be continuous. ■ 


EXAMPLE 4.3 Let X represent the amount of time a randomly selected customer spends waiting for 
a haircut before his/her haircut commences. Your first thought might be that X is 
a continuous random variable, since a measurement is required to determine its 
value. However, there are customers lucky enough to have no wait whatsoever 
before climbing into the barber’s chair. So it must be the case that P(X = 0) > 0. 
Conditional on no chairs being empty, though, the waiting time will be continuous 
since X could then assume any value between some minimum possible time A and a 
maximum possible time B. This random variable is neither purely discrete nor purely 
continuous but instead is a mixture of the two types. ■ 


One might argue that although in principle variables such as height, weight, 
and temperature are continuous, in practice the limitations of our measuring instru- 
ments restrict us to a discrete (though sometimes very finely subdivided) world. 
However, continuous models often approximate real-world situations very well, and 
continuous mathematics (the calculus) is frequently easier to work with than math- 
ematics of discrete variables and distributions. 


Probability Distributions for Continuous 
Variables 


Suppose the variable X of interest is the depth of a lake at a randomly chosen point 
on the surface. Let M = the maximum depth (in meters), so that any number in the 
interval [0, M] is a possible value of X. If we “discretize” X by measuring depth to 
the nearest meter, then possible values are nonnegative integers less than or equal to 
M. The resulting discrete distribution of depth can be pictured using a probability his- 
togram. If we draw the histogram so that the area of the rectangle above any possible 
integer k is the proportion of the lake whose depth is (to the nearest meter) k, then 
the total area of all rectangles is 1. A possible histogram appears in Figure 4.1(a). 

If depth is measured much more accurately and the same measurement axis as 
in Figure 4.1(a) is used, each rectangle in the resulting probability histogram is much 
narrower, though the total area of all rectangles is still 1. A possible histogram is 
pictured in Figure 4.1(b); it has a much smoother appearance than the histogram in 
Figure 4.1(a). If we continue in this way to measure depth more and more finely, the 
resulting sequence of histograms approaches a smooth curve, such as is pictured in 
Figure 4.1(c). Because for each histogram the total area of all rectangles equals 1, 
the total area under the smooth curve is also 1. The probability that the depth at a 
randomly chosen point is between a and b is just the area under the smooth curve 
between a and b. It is exactly a smooth curve of the type pictured in Figure 4.1(c) 
that specifies a continuous probability distribution. 


Figure 4.1 (a) Probability histogram of depth measured to the nearest meter; (b) probability 
histogram of depth measured to the nearest centimeter; (c) a limit of a sequence of discrete 
histograms 


DEFINITION Let X be a continuous rv. Then a probability distribution or probability 
density function (pdf) of X is a function f(x) such that for any two numbers a and 
b with a ≤ b, 

\[ P(a \le X \le b) = \int_{a}^{b} f(x)\, dx \]

That is, the probability that X takes on a value in the interval [a, b] is the area 
above this interval and under the graph of the density function, as illustrated in 
Figure 4.2. The graph of f(x) is often referred to as the density curve. 


Figure 4.2 P(a ≤ X ≤ b) = the area under the density curve between a and b 


For f(x) to be a legitimate pdf, it must satisfy the following two conditions: 


1. f(x) ≥ 0 for all x 


2. $\int_{-\infty}^{\infty} f(x)\, dx$ = area under the entire graph of f(x) = 1 
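
The two conditions can be checked numerically for a candidate density. The sketch below is only an illustration: the density f(x) = 2x on [0, 1] and the helper names `integrate` and `triangular_pdf` are assumptions introduced here, not taken from the text, and the midpoint rule stands in for exact integration.

```python
def integrate(f, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def triangular_pdf(x):
    # Hypothetical density chosen for illustration: f(x) = 2x on [0, 1].
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# Condition 1: f(x) >= 0 at every point of a test grid covering [-1, 2].
grid = [-1.0 + 0.01 * i for i in range(301)]
nonnegative = all(triangular_pdf(x) >= 0.0 for x in grid)

# Condition 2: the total area is 1 (the density is zero outside [0, 1]).
total_area = integrate(triangular_pdf, 0.0, 1.0)

# A probability is the area over an interval, here P(.25 <= X <= .5).
prob = integrate(triangular_pdf, 0.25, 0.5)

print(nonnegative, round(total_area, 6), round(prob, 6))
```

A grid check cannot prove nonnegativity everywhere, so in practice condition 1 is usually argued from the formula itself.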


EXAMPLE 4.4 The direction of an imperfection with respect to a reference line on a circular object 
such as a tire, brake rotor, or flywheel is, in general, subject to uncertainty. Consider 
the reference line connecting the valve stem on a tire to the center point, and let X 
be the angle measured clockwise to the location of an imperfection. One possible 
pdf for X is 

\[ f(x) = \begin{cases} \dfrac{1}{360} & 0 \le x < 360 \\ 0 & \text{otherwise} \end{cases} \]


The pdf is graphed in Figure 4.3. Clearly f(x) ≥ 0. The area under the density curve 
is just the area of a rectangle: (height)(base) = (1/360)(360) = 1. The probability 
that the angle is between 90° and 180° is 

\[ P(90 \le X \le 180) = \int_{90}^{180} \frac{1}{360}\, dx = \left. \frac{x}{360} \right|_{x=90}^{x=180} = \frac{1}{4} = .25 \]

The probability that the angle of occurrence is within 90° of the reference line is 

\[ P(0 \le X \le 90) + P(270 \le X < 360) = .25 + .25 = .50 \]


Figure 4.3 The pdf and probability from Example 4.4 (the shaded area is P(90 ≤ X ≤ 180)) ■ 

Because whenever 0 ≤ a ≤ b ≤ 360 in Example 4.4, P(a ≤ X ≤ b) depends only 
on the width b - a of the interval, X is said to have a uniform distribution. 


DEFINITION A continuous rv X is said to have a uniform distribution on the interval 
[A, B] if the pdf of X is 

\[ f(x; A, B) = \begin{cases} \dfrac{1}{B - A} & A \le x \le B \\ 0 & \text{otherwise} \end{cases} \]


The graph of any uniform pdf looks like the graph in Figure 4.3 except that the inter- 
val of positive density is [A, B] rather than [0, 360]. 
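
Because a uniform probability depends only on interval width, it can be computed without any integration. The following is a minimal sketch; the function names `uniform_pdf` and `uniform_prob` are illustrative choices, not notation from the text. It reproduces the two probabilities of Example 4.4.

```python
def uniform_pdf(x, A, B):
    """Density of a uniform distribution on [A, B]."""
    return 1.0 / (B - A) if A <= x <= B else 0.0


def uniform_prob(a, b, A, B):
    """P(a <= X <= b) for X uniform on [A, B]; [a, b] is clipped to [A, B]."""
    lo, hi = max(a, A), min(b, B)
    return max(hi - lo, 0.0) / (B - A)


# Example 4.4: the angle X is uniform on [0, 360].
p_quarter = uniform_prob(90, 180, 0, 360)
p_near_line = uniform_prob(0, 90, 0, 360) + uniform_prob(270, 360, 0, 360)

print(p_quarter, p_near_line)  # 0.25 0.5
```

Whether the endpoint 360 is included makes no difference to these values, since a single point carries zero probability.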

In the discrete case, a probability mass function (pmf) tells us how little 
“blobs” of probability mass of various magnitudes are distributed along the mea- 
surement axis. In the continuous case, probability density is “smeared” in a continu- 
ous fashion along the interval of possible values. When density is smeared uniformly 
over the interval, a uniform pdf, as in Figure 4.3, results. 

When X is a discrete random variable, each possible value is assigned positive 
probability. This is not true of a continuous random variable (that is, the second 
condition of the definition is satisfied) because the area under a density curve that 
lies above any single value is zero: 

\[ P(X = c) = \int_{c}^{c} f(x)\, dx = \lim_{\varepsilon \to 0} \int_{c - \varepsilon}^{c + \varepsilon} f(x)\, dx = 0 \]

The fact that P(X = c) = 0 when X is continuous has an important practical 
consequence: The probability that X lies in some interval between a and b does not 
depend on whether the lower limit a or the upper limit b is included in the 
probability calculation: 

\[ P(a \le X \le b) = P(a < X < b) = P(a < X \le b) = P(a \le X < b) \qquad (4.1) \]

If X is discrete and both a and b are possible values (e.g., X is binomial with n = 20 
and a = 5, b = 10), then all four of the probabilities in (4.1) are different. 
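
The discrete contrast can be seen by direct computation. In this sketch the success probability p = .5 is an assumption made only for illustration (the text specifies n = 20, a = 5, and b = 10 but no value of p), and `binom_pmf` is a helper name introduced here.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial pmf b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 20, 0.5   # p = .5 is an assumed value for illustration
a, b = 5, 10

closed = sum(binom_pmf(x, n, p) for x in range(a, b + 1))          # P(a <= X <= b)
open_both = sum(binom_pmf(x, n, p) for x in range(a + 1, b))       # P(a < X < b)
left_open = sum(binom_pmf(x, n, p) for x in range(a + 1, b + 1))   # P(a < X <= b)
right_open = sum(binom_pmf(x, n, p) for x in range(a, b))          # P(a <= X < b)

print(closed, open_both, left_open, right_open)
```

Each pair of values differs by the pmf at an endpoint, b(5; 20, .5) or b(10; 20, .5), which is exactly the quantity that vanishes in the continuous case.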

The zero probability condition has a physical analog. Consider a solid circular 
rod with cross-sectional area = 1 in². Place the rod alongside a measurement axis 
and suppose that the density of the rod at any point x is given by the value f(x) of a 
density function. Then if the rod is sliced at points a and b and this segment is 
removed, the amount of mass removed is $\int_{a}^{b} f(x)\, dx$; if the rod is sliced just at the 
point c, no mass is removed. Mass is assigned to interval segments of the rod but 
not to individual points. 


EXAMPLE 4.5 “Time headway” in traffic flow is the elapsed time between the time that one car 
finishes passing a fixed point and the instant that the next car begins to pass that 
point. Let X = the time headway for two randomly chosen consecutive cars on a 
freeway during a period of heavy flow. The following pdf of X is essentially the one 
suggested in “The Statistical Properties of Freeway Traffic” (Transp. Res., vol. 
11: 221-228): 

\[ f(x) = \begin{cases} .15\, e^{-.15(x - .5)} & x \ge .5 \\ 0 & \text{otherwise} \end{cases} \]

The graph of f(x) is given in Figure 4.4; there is no density associated with 
headway times less than .5, and headway density decreases rapidly (exponentially 
fast) as x increases from .5. Clearly, f(x) ≥ 0; to show that $\int_{-\infty}^{\infty} f(x)\, dx = 1$, we use 
the calculus result $\int_{a}^{\infty} e^{-kx}\, dx = (1/k)\, e^{-k \cdot a}$. Then 

\[ \int_{-\infty}^{\infty} f(x)\, dx = \int_{.5}^{\infty} .15\, e^{-.15(x - .5)}\, dx = .15\, e^{.075} \int_{.5}^{\infty} e^{-.15x}\, dx = .15\, e^{.075} \cdot \frac{1}{.15}\, e^{-(.15)(.5)} = 1 \]


Figure 4.4 The density curve for time headway in Example 4.5 

The probability that headway time is at most 5 sec is 

\[ P(X \le 5) = \int_{-\infty}^{5} f(x)\, dx = \int_{.5}^{5} .15\, e^{-.15(x - .5)}\, dx = .15\, e^{.075} \int_{.5}^{5} e^{-.15x}\, dx = .15\, e^{.075} \cdot \left. \left( -\frac{1}{.15}\, e^{-.15x} \right) \right|_{x=.5}^{x=5} \]

\[ = e^{.075} \left( -e^{-.75} + e^{-.075} \right) = 1.078(-.472 + .928) = .491 = P(\text{less than 5 sec}) = P(X < 5) \]

■ 
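
The arithmetic in Example 4.5 can be cross-checked numerically. In the sketch below, the closed-form expression in `headway_cdf` comes from carrying out the integral above with a general upper limit x (a step not written out in the text), and `integrate` is a simple midpoint rule introduced only for this check.

```python
from math import exp


def headway_pdf(x):
    """pdf from Example 4.5: .15 * exp(-.15 (x - .5)) for x >= .5, else 0."""
    return 0.15 * exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0


def headway_cdf(x):
    # Integrating the pdf from .5 to x gives 1 - exp(-.15 (x - .5)).
    return 1.0 - exp(-0.15 * (x - 0.5)) if x >= 0.5 else 0.0


def integrate(f, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


p_closed = headway_cdf(5.0)                    # closed form for P(X <= 5)
p_numeric = integrate(headway_pdf, 0.5, 5.0)   # numerical area under the pdf

# Nearly all of the area lies below x = 200, so this should be close to 1.
area_check = integrate(headway_pdf, 0.5, 200.0, n=200_000)

print(round(p_closed, 3), round(p_numeric, 3), round(area_check, 6))
```

Both routes agree with the value .491 obtained in the example, up to rounding.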


Unlike discrete distributions such as the binomial, hypergeometric, and nega- 
tive binomial, the distribution of any given continuous rv cannot usually be derived 
using simple probabilistic arguments. Instead, one must make a judicious choice of 
pdf based on prior knowledge and available data. Fortunately, there are some general 
families of pdf’s that have been found to be sensible candidates in a wide variety of 
experimental situations; several of these are discussed later in the chapter. 

Just as in the discrete case, it is often helpful to think of the population of 
interest as consisting of X values rather than individuals or objects. The pdf is then 
a model for the distribution of values in this numerical population, and from this 
model various population characteristics (such as the mean) can be calculated. 


EXERCISES Section 4.1 (1-10) 


The current in a certain circuit as measured by an ammeter 
is a continuous random variable X with the following 
density function: 

\[ f(x) = \begin{cases} .075x + .2 & 3 \le x \le 5 \\ 0 & \text{otherwise} \end{cases} \]

a. Graph the pdf and verify that the total area under the 
density curve is indeed 1. 

b. Calculate P(X ≤ 4). How does this probability compare 
to P(X < 4)? 

c. Calculate P(3.5 ≤ X ≤ 4.5) and also P(4.5 < X). 


Suppose the reaction temperature X (in °C) in a certain 
chemical process has a uniform distribution with A = -5 
and B = 5. 

a. Compute P(X < 0). 

b. Compute P(-2.5 < X < 2.5). 

c. Compute P(-2 ≤ X ≤ 3). 

d. For k satisfying -5 < k < k + 4 < 5, compute 
P(k < X < k + 4). 


The error involved in making a certain measurement is a 
continuous rv X with pdf 

\[ f(x) = \begin{cases} .09375(4 - x^{2}) & -2 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]

a. Sketch the graph of f(x). 

b. Compute P(X > 0). 

c. Compute P(-1 < X < 1). 

d. Compute P(X < -.5 or X > .5). 


Let X denote the vibratory stress (psi) on a wind turbine 
blade at a particular wind speed in a wind tunnel. 
The article “Blade Fatigue Life Assessment with 
Application to VAWTS” (J. of Solar Energy Engr., 1982: 
107-111) proposes the Rayleigh distribution, with pdf 

\[ f(x; \theta) = \begin{cases} \dfrac{x}{\theta^{2}}\, e^{-x^{2}/(2\theta^{2})} & x > 0 \\ 0 & \text{otherwise} \end{cases} \]

as a model for the X distribution. 

a. Verify that f(x; θ) is a legitimate pdf. 

b. Suppose θ = 100 (a value suggested by a graph in 
the article). What is the probability that X is at most 
200? Less than 200? At least 200? 

c. What is the probability that X is between 100 and 200 
(again assuming θ = 100)? 

d. Give an expression for P(X ≤ x). 


A college professor never finishes his lecture before the 
end of the hour and always finishes his lectures within 
2 min after the hour. Let X = the time that elapses 
between the end of the hour and the end of the lecture and 
suppose the pdf of X is 


\[ f(x) = \begin{cases} kx^{2} & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]

a. Find the value of k and draw the corresponding density 
curve. [Hint: Total area under the graph of f(x) is 1.] 

b. What is the probability that the lecture ends within 
1 min of the end of the hour? 

c. What is the probability that the lecture continues 
beyond the hour for between 60 and 90 sec? 

d. What is the probability that the lecture continues for 
at least 90 sec beyond the end of the hour? 


The actual tracking weight of a stereo cartridge that is set 
to track at 3 g on a particular changer can be regarded as 
a continuous rv X with pdf 


\[ f(x) = \begin{cases} k[1 - (x - 3)^{2}] & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases} \]

a. Sketch the graph of f(x). 

b. Find the value of k. 

c. What is the probability that the actual tracking weight 
is greater than the prescribed weight? 

d. What is the probability that the actual weight is 
within .25 g of the prescribed weight? 

e. What is the probability that the actual weight differs 
from the prescribed weight by more than .5 g? 


The article “Second Moment Reliability Evaluation 
vs. Monte Carlo Simulations for Weld Fatigue 
Strength” (Quality and Reliability Engr. Intl., 2012: 
887-896) considered the use of a uniform distribution 
with A = .20 and B = 4.25 for the diameter X of a certain 
type of weld (mm). 

a. Determine the pdf of X and graph it. 

b. What is the probability that diameter exceeds 3 mm? 

c. What is the probability that diameter is within 1 mm 
of the mean diameter? 

d. For any value a satisfying .20 < a < a + 1 < 4.25, 
what is P(a < X < a + 1)? 


In commuting to work, a professor must first get on a bus 
near her house and then transfer to a second bus. If the 
waiting time (in minutes) at each stop has a uniform 
distribution with A = 0 and B = 5, then it can be shown 
that the total waiting time Y has the pdf 

\[ f(y) = \begin{cases} \dfrac{1}{25}\, y & 0 \le y < 5 \\ \dfrac{2}{5} - \dfrac{1}{25}\, y & 5 \le y \le 10 \\ 0 & y < 0 \text{ or } y > 10 \end{cases} \]

a. Sketch a graph of the pdf of Y. 

b. Verify that $\int_{-\infty}^{\infty} f(y)\, dy = 1$. 

c. What is the probability that total waiting time is at 
most 3 min? 

d. What is the probability that total waiting time is at 
most 8 min? 

e. What is the probability that total waiting time is 
between 3 and 8 min? 

f. What is the probability that total waiting time is 
either less than 2 min or more than 6 min? 


Based on an analysis of sample data, the article 
“Pedestrians’ Crossing Behaviors and Safety at 
Unmarked Roadways in China” (Accident Analysis 
and Prevention, 2011: 1927-1936) proposed the pdf 
$f(x) = .15\, e^{-.15(x - 1)}$ when x ≥ 1 as a model for the distribution 
of X = time (sec) spent at the median line. 

a. What is the probability that waiting time is at most 
5 sec? More than 5 sec? 

b. What is the probability that waiting time is between 
2 and 5 sec? 


10. A family of pdf’s that has been used to approximate the 
distribution of income, city population size, and size of 
firms is the Pareto family. The family has two parameters, 
k and θ, both > 0, and the pdf is 

\[ f(x; k, \theta) = \begin{cases} \dfrac{k \cdot \theta^{k}}{x^{k+1}} & x \ge \theta \\ 0 & x < \theta \end{cases} \]

a. Sketch the graph of f(x; k, θ). 

b. Verify that the total area under the graph equals 1. 

c. If the rv X has pdf f(x; k, θ), for any fixed b > θ, 
obtain an expression for P(X ≤ b). 

d. For θ < a < b, obtain an expression for the probability 
P(a ≤ X ≤ b). 


4.2 Cumulative Distribution Functions and Expected Values 


Several of the most important concepts introduced in the study of discrete distributions 
also play an important role for continuous distributions. Definitions analogous 
to those in Chapter 3 involve replacing summation by integration. 


The Cumulative Distribution Function 


The cumulative distribution function (cdf) F(x) for a discrete rv X gives, for any specified 
number x, the probability P(X ≤ x). It is obtained by summing the pmf p(y) over all 
possible values y satisfying y ≤ x. The cdf of a continuous rv gives the same probabilities 
P(X ≤ x) and is obtained by integrating the pdf f(y) between the limits -∞ and x. 


DEFINITION The cumulative distribution function F(x) for a continuous rv X is defined 
for every number x by 

\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\, dy \]


For each x, F(x) is the area under the density curve to the left of x. This is illus- 
trated in Figure 4.5, where F(x) increases smoothly as x increases. 


Figure 4.5 A pdf and associated cdf 


EXAMPLE 4.6 Let X, the thickness of a certain metal sheet, have a uniform distribution on 
[A, B]. The density function is shown in Figure 4.6. For x < A, F(x) = 0, since 
there is no area under the graph of the density function to the left of such an x. For 
x ≥ B, F(x) = 1, since all the area is accumulated to the left of such an x. Finally, for 
A ≤ x ≤ B, 

\[ F(x) = \int_{-\infty}^{x} f(y)\, dy = \int_{A}^{x} \frac{1}{B - A}\, dy = \left. \frac{1}{B - A} \cdot y \right|_{y=A}^{y=x} = \frac{x - A}{B - A} \]


Figure 4.6 The pdf for a uniform distribution (the shaded area is F(x)) 


The entire cdf is 

\[ F(x) = \begin{cases} 0 & x < A \\ \dfrac{x - A}{B - A} & A \le x < B \\ 1 & x \ge B \end{cases} \]

The graph of this cdf appears in Figure 4.7. 


Figure 4.7 The cdf for a uniform distribution ■ 


Using F(x) to Compute Probabilities 


The importance of the cdf here, just as for discrete rv’s, is that probabilities of vari- 
ous intervals can be computed from a formula for or table of F(x). 


Let X be a continuous rv with pdf f(x) and cdf F(x). Then for any number a, 
P(X >a)=1-— Fa) 
and for any two numbers a and b with a < b, 


Pa@=X=b) = F(b) — Fa) 


Figure 4.8 illustrates the second part of this proposition; the desired probability is 
the shaded area under the density curve between a and b, and it equals the difference 
between the two shaded cumulative areas. This is different from what is appro- 
priate for a discrete integer-valued random variable (e.g., binomial or Poisson): 
P(a ≤ X ≤ b) = F(b) − F(a − 1) when a and b are integers. 


Figure 4.8 Computing P(a ≤ X ≤ b) from cumulative probabilities 


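EXAMPLE 4.7 Suppose the pdf of the magnitude X of a dynamic load on a bridge (in newtons) is 
given by 

\[ f(x) = \begin{cases} \dfrac{1}{8} + \dfrac{3}{8}x & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]


For any number x between 0 and 2, 


\[ F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_{0}^{x} \left( \frac{1}{8} + \frac{3}{8}y \right) dy = \frac{x}{8} + \frac{3}{16}x^2 \]


Thus 

\[ F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x}{8} + \dfrac{3}{16}x^2 & 0 \le x \le 2 \\ 1 & 2 < x \end{cases} \]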
The graphs of f(x) and F(x) are shown in Figure 4.9. The probability that the load 
is between 1 and 1.5 is 


\[ P(1 \le X \le 1.5) = F(1.5) - F(1) = \left[ \frac{1}{8}(1.5) + \frac{3}{16}(1.5)^2 \right] - \left[ \frac{1}{8}(1) + \frac{3}{16}(1)^2 \right] = \frac{19}{64} = .297 \]

The probability that the load exceeds 1 is 


\[ P(X > 1) = 1 - P(X \le 1) = 1 - F(1) = 1 - \left[ \frac{1}{8}(1) + \frac{3}{16}(1)^2 \right] = \frac{11}{16} = .688 \]
Figure 4.9 The pdf and cdf for Example 4.7 


Once the cdf has been obtained, any probability involving X can easily be cal- 
culated without any further integration. 
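
As a quick numerical companion to Example 4.7, the sketch below codes the cdf derived there and reads both probabilities off it with no further integration. The function name `load_cdf` is an illustrative choice, not notation from the text.

```python
def load_cdf(x: float) -> float:
    """cdf from Example 4.7: F(x) = x/8 + (3/16)x^2 for 0 <= x <= 2."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x / 8 + 3 * x**2 / 16


# P(1 <= X <= 1.5) = F(1.5) - F(1)
p_between = load_cdf(1.5) - load_cdf(1.0)

# P(X > 1) = 1 - F(1)
p_exceeds = 1.0 - load_cdf(1.0)

print(p_between, p_exceeds)
```

The two printed values should agree with 19/64 and 11/16 from the example.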


Obtaining f(x) from F(x) 


For X discrete, the pmf is obtained from the cdf by taking the difference between two 
F(x) values. The continuous analog of a difference is a derivative. The following 
result is a consequence of the Fundamental Theorem of Calculus. 


PROPOSITION If X is a continuous rv with pdf f(x) and cdf F(x), then at every x at which the 
derivative F'(x) exists, F'(x) = f(x). 


EXAMPLE 4.8 (Example 4.6 continued) When X has a uniform distribution, F(x) is differentiable 
except at x = A and x = B, where the graph of F(x) has sharp corners. Since F(x) = 0 for x < A 
and F(x) = 1 for x > B, F'(x) = 0 = f(x) for such x. For A < x < B, 

\[ F'(x) = \frac{d}{dx}\left( \frac{x - A}{B - A} \right) = \frac{1}{B - A} = f(x) \]
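
The relation F'(x) = f(x) can also be checked numerically. The sketch below approximates the derivative of the uniform cdf with a central difference; the endpoints A = 2 and B = 5 and all function names are hypothetical values chosen only for illustration.

```python
A, B = 2.0, 5.0  # hypothetical endpoints, not taken from the text


def uniform_cdf(x: float) -> float:
    """cdf of the uniform distribution on [A, B]."""
    if x < A:
        return 0.0
    if x >= B:
        return 1.0
    return (x - A) / (B - A)


def uniform_pdf(x: float) -> float:
    """pdf of the uniform distribution on [A, B]."""
    return 1.0 / (B - A) if A <= x <= B else 0.0


def central_difference(func, x: float, h: float = 1e-6) -> float:
    """Approximate func'(x) by (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)


for point in (1.0, 3.0, 4.5, 6.0):  # points away from the corners at A and B
    print(point, central_difference(uniform_cdf, point), uniform_pdf(point))
```

The points x = A and x = B are deliberately avoided, since F has no derivative at those corners.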


Percentiles of a Continuous Distribution 


When we say that an individual’s test score was at the 85th percentile of the popu- 
lation, we mean that 85% of all population scores were below that score and 15% 
were above. Similarly, the 40th percentile is the score that exceeds 40% of all scores 
and is exceeded by 60% of all scores (having a value corresponding to a high percentile 
is not necessarily good; e.g., you would not want to be at the 99th percentile 
for blood alcohol content). 


DEFINITION Let p be a number between 0 and 1. The (100p)th percentile of the distribution 
of a continuous rv X, denoted by η(p), is defined by 


\[ p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy \qquad (4.2) \]


According to Expression (4.2), η(p) is that value on the measurement axis such that 
100p% of the area under the graph of f(x) lies to the left of η(p) and 100(1 − p)% 
lies to the right. Thus η(.75), the 75th percentile, is such that the area under the graph 
of f(x) to the left of η(.75) is .75. Figure 4.10 illustrates the definition. 


Figure 4.10 The (100p)th percentile of a continuous distribution [left panel: the shaded area under f(x) to the left of η(p) equals p; right panel: p = F(η(p))]


EXAMPLE 4.9 The distribution of the amount of gravel (in tons) sold by a particular construction 
supply company in a given week is a continuous rv X with pdf 


\[ f(x) = \begin{cases} \dfrac{3}{2}(1 - x^2) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]


The cdf of sales for any x between 0 and 1 is 


\[ F(x) = \int_{0}^{x} \frac{3}{2}(1 - y^2)\,dy = \frac{3}{2}\left( y - \frac{y^3}{3} \right)\Big|_{y=0}^{y=x} = \frac{3}{2}\left( x - \frac{x^3}{3} \right) \]


The graphs of both f(x) and F(x) appear in Figure 4.11. The (100p)th percentile of 
this distribution satisfies the equation 


\[ p = F(\eta(p)) = \frac{3}{2}\left[ \eta(p) - \frac{(\eta(p))^3}{3} \right] \]


that is, 


\[ (\eta(p))^3 - 3\eta(p) + 2p = 0 \]


For the 50th percentile, p = .5, and the equation to be solved is η³ − 3η + 1 = 0; 
the solution is η = η(.5) = .347. If the distribution remains the same from week to 
week, then in the long run 50% of all weeks will result in sales of less than .347 ton 
and 50% in more than .347 ton. 
Figure 4.11 The pdf and cdf for Example 4.9 
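
The cubic equation in Example 4.9 has no convenient closed-form solution, so a percentile is usually found numerically. The sketch below uses bisection on the cdf of the example; bisection is one reasonable choice here (the text does not say how .347 was obtained), and the names `gravel_cdf` and `gravel_percentile` are illustrative.

```python
def gravel_cdf(x: float) -> float:
    """cdf from Example 4.9 on [0, 1]: F(x) = (3/2)(x - x^3/3)."""
    return 1.5 * (x - x**3 / 3)


def gravel_percentile(p: float, tol: float = 1e-10) -> float:
    """Solve F(eta) = p for eta in [0, 1] by bisection (F is increasing there)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gravel_cdf(mid) < p:
            lo = mid   # the percentile lies to the right of mid
        else:
            hi = mid   # the percentile lies at or to the left of mid
    return (lo + hi) / 2


median = gravel_percentile(0.5)
print(round(median, 3))
```

The same function gives any other percentile of this distribution, for example `gravel_percentile(0.75)`.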


DEFINITION The median of a continuous distribution, denoted by μ̃, is the 50th percentile, 
so μ̃ satisfies .5 = F(μ̃). That is, half the area under the density curve is to the 
left of μ̃ and half is to the right of μ̃. 


A continuous distribution whose pdf is symmetric—the graph of the pdf to the 
left of some point is a mirror image of the graph to the right of that point—has 
median μ̃ equal to the point of symmetry, since half the area under the curve lies 
to either side of this point. Figure 4.12 gives several examples. The error in 
a measurement of a physical quantity is often assumed to have a symmetric 
distribution. 


Figure 4.12 Medians of symmetric distributions 


Expected Values 


For a discrete random variable X, E(X) was obtained by summing x · p(x) over possible 
X values. Here we replace summation by integration and the pmf by the pdf to 
get a continuous weighted average. 


DEFINITION The expected or mean value of a continuous rv X with pdf f(x) is 


\[ \mu_X = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx \]

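EXAMPLE 4.10 (Example 4.9 continued) The pdf of weekly gravel sales X was 

\[ f(x) = \begin{cases} \dfrac{3}{2}(1 - x^2) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]

so 


\[ E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx = \int_{0}^{1} x \cdot \frac{3}{2}(1 - x^2)\,dx = \frac{3}{2} \int_{0}^{1} (x - x^3)\,dx = \frac{3}{2}\left( \frac{x^2}{2} - \frac{x^4}{4} \right)\Big|_{x=0}^{x=1} = \frac{3}{8} \]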


When the pdf f(x) specifies a model for the distribution of values in a numerical 
population, then μ is the population mean, which is the most frequently used 
measure of population location or center. 

Often we wish to compute the expected value of some function h(X) of the 
rv X. If we think of h(X) as a new rv Y, techniques from mathematical statistics can 
be used to derive the pdf of Y, and E(Y) can then be computed from the definition. 
Fortunately, as in the discrete case, there is an easier way to compute E[h(X)]. 


PROPOSITION If X is a continuous rv with pdf f(x) and h(X) is any function of X, then 


\[ E[h(X)] = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x) \cdot f(x)\,dx \]


That is, just as E(X) is a weighted average of possible X values, where the weighting 
function is the pdf f(x), E[h(X)] is a weighted average of h(X) values. 


EXAMPLE 4.11 Two species are competing in a region for control of a limited amount of a certain 
resource. Let X = the proportion of the resource controlled by species 1 and suppose 
X has pdf 

\[ f(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \]


which is a uniform distribution on [0, 1]. (In her book Ecological Diversity, E. C. 
Pielou calls this the “broken-stick” model for resource allocation, since it is analogous 
to breaking a stick at a randomly chosen point.) Then the species that controls 
the majority of this resource controls the amount 


\[ h(X) = \max(X, 1 - X) = \begin{cases} 1 - X & \text{if } 0 \le X < \frac{1}{2} \\ X & \text{if } \frac{1}{2} \le X \le 1 \end{cases} \]


The expected amount controlled by the species having majority control is then 


\[ E[h(X)] = \int_{-\infty}^{\infty} \max(x, 1 - x) \cdot f(x)\,dx = \int_{0}^{1} \max(x, 1 - x) \cdot 1\,dx = \int_{0}^{1/2} (1 - x) \cdot 1\,dx + \int_{1/2}^{1} x \cdot 1\,dx = \frac{3}{4} \]
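
The value 3/4 in Example 4.11 can be sanity-checked by simulating the broken-stick model directly. The sketch below is only a simulation check under an arbitrary fixed seed and sample size, not the integration used in the example; the variable names are illustrative.

```python
import random

rng = random.Random(0)   # fixed seed so repeated runs give the same estimate
n_draws = 200_000        # arbitrary sample size chosen for illustration

total = 0.0
for _ in range(n_draws):
    x = rng.random()          # break point, uniform on [0, 1)
    total += max(x, 1 - x)    # share held by the majority species

estimate = total / n_draws
print(estimate)
```

The estimate should land close to .75, with the small discrepancy reflecting simulation error rather than any flaw in the exact calculation.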

In the discrete case, the variance of X was defined as the expected squared deviation 
from μ and was calculated by summation. Here again integration replaces 
summation. 


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 




DEFINITION The variance of a continuous random variable X with pdf f(x) and mean value 
μ is 


\[ \sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x)\,dx = E[(X - \mu)^2] \]


The standard deviation (SD) of X is σ_X = √V(X). 


The variance and standard deviation give quantitative measures of how much spread 
there is in the distribution or population of x values. Again σ is roughly the size of 
a typical deviation from μ. Computation of σ² is facilitated by using the same shortcut 
formula employed in the discrete case. 


PROPOSITION V(X) = E(X²) − [E(X)]² 


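EXAMPLE 4.12 (Example 4.10 continued) For X = weekly gravel sales, we computed E(X) = 3/8. Since 


\[ E(X^2) = \int_{-\infty}^{\infty} x^2 \cdot f(x)\,dx = \int_{0}^{1} x^2 \cdot \frac{3}{2}(1 - x^2)\,dx = \int_{0}^{1} \frac{3}{2}(x^2 - x^4)\,dx = \frac{1}{5} \]


\[ V(X) = \frac{1}{5} - \left( \frac{3}{8} \right)^2 = \frac{19}{320} = .059 \quad \text{and} \quad \sigma_X = .244 \]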
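
To see that the shortcut formula and the defining integral give the same variance for the gravel-sales pdf, both can be approximated numerically. The sketch below uses a simple midpoint rule; the helper `integrate`, the number of subintervals, and the other names are assumptions made for illustration.

```python
def integrate(func, lower: float, upper: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / n
    return width * sum(func(lower + (i + 0.5) * width) for i in range(n))


def gravel_pdf(x: float) -> float:
    """pdf from Example 4.9: f(x) = (3/2)(1 - x^2) on [0, 1], 0 otherwise."""
    return 1.5 * (1 - x**2) if 0 <= x <= 1 else 0.0


# E(X): integral of x * f(x)
mean = integrate(lambda x: x * gravel_pdf(x), 0, 1)

# V(X) from the definition: integral of (x - mu)^2 * f(x)
var_definition = integrate(lambda x: (x - mean) ** 2 * gravel_pdf(x), 0, 1)

# V(X) from the shortcut: E(X^2) - [E(X)]^2
var_shortcut = integrate(lambda x: x**2 * gravel_pdf(x), 0, 1) - mean**2

sd = var_shortcut ** 0.5
print(mean, var_definition, var_shortcut, sd)
```

Both variance values should agree with 19/320 up to the small error of the numerical integration.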


When h(X) = aX + b, the expected value and variance of h(X) satisfy the same 
properties as in the discrete case: E[h(X)] = aμ + b and V[h(X)] = a² · σ². 


EXERCISES Section 4.2 (11-27) 


11. Let X denote the amount of time a book on two-hour reserve is actually checked out, 
and suppose the cdf is 

\[ F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^2}{4} & 0 \le x < 2 \\ 1 & 2 \le x \end{cases} \]

a. Calculate P(X ≤ 1). 
b. Calculate P(.5 ≤ X ≤ 1). 
c. Calculate P(X > 1.5). 
d. What is the median checkout duration μ̃? [solve .5 = F(μ̃)]. 
e. Obtain the density function f(x). 
f. Calculate E(X). 
g. Calculate V(X) and σ_X. 
h. If the borrower is charged an amount h(X) = X² when checkout duration is X, 
compute the expected charge E[h(X)]. 


12. The cdf for X (= measurement error) of Exercise 3 is 

\[ F(x) = \begin{cases} 0 & x < -2 \\ \dfrac{1}{2} + \dfrac{3}{32}\left( 4x - \dfrac{x^3}{3} \right) & -2 \le x < 2 \\ 1 & 2 \le x \end{cases} \]

a. Compute P(X < 0). 
b. Compute P(−1 < X < 1). 
c. Compute P(.5 < X). 
d. Verify that f(x) is as given in Exercise 3 by obtaining F'(x). 
e. Verify that μ̃ = 0. 


13. Example 4.5 introduced the concept of time headway in traffic flow and proposed a 
particular distribution for X = the headway between two randomly selected consecutive 
cars (sec). Suppose that in a different traffic environment, the distribution of time 
headway has the form 

\[ f(x) = \begin{cases} \dfrac{k}{x^4} & x > 1 \\ 0 & x \le 1 \end{cases} \]

a. Determine the value of k for which f(x) is a legitimate pdf. 
b. Obtain the cumulative distribution function. 
c. Use the cdf from (b) to determine the probability that headway exceeds 2 sec and 
also the probability that headway is between 2 and 3 sec. 
d. Obtain the mean value of headway and the standard deviation of headway. 
e. What is the probability that headway is within 1 standard deviation of the mean value? 


14. The article “Modeling Sediment and Water Column Interactions for Hydrophobic 
Pollutants” (Water Research, 1984: 1169-1174) suggests the uniform distribution on the 
interval (7.5, 20) as a model for depth (cm) of the bioturbation layer in sediment in a 
certain region. 
a. What are the mean and variance of depth? 
b. What is the cdf of depth? 
c. What is the probability that observed depth is at most 10? Between 10 and 15? 
d. What is the probability that the observed depth is within 1 standard deviation of 
the mean value? Within 2 standard deviations? 


15. Let X denote the amount of space occupied by an article placed in a 1-ft³ packing 
container. The pdf of X is 

\[ f(x) = \begin{cases} 90x^8(1 - x) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases} \]

a. Graph the pdf. Then obtain the cdf of X and graph it. 
b. What is P(X ≤ .5) [i.e., F(.5)]? 
c. Using the cdf from (a), what is P(.25 < X ≤ .5)? What is P(.25 ≤ X ≤ .5)? 
d. What is the 75th percentile of the distribution? 
e. Compute E(X) and σ_X. 
f. What is the probability that X is more than 1 standard deviation from its mean value? 


16. The article “A Model of Pedestrians’ Waiting Times for Street Crossings at 
Signalized Intersections” (Transportation Research, 2013: 17-28) suggested that under 
some circumstances the distribution of waiting time X could be modeled with the 
following pdf: 

\[ f(x; \theta, \tau) = \begin{cases} \dfrac{\theta}{\tau}\left( 1 - \dfrac{x}{\tau} \right)^{\theta - 1} & 0 \le x < \tau \\ 0 & \text{otherwise} \end{cases} \]

a. Graph f(x; θ, 80) for the three cases θ = 4, 1, and .5 (these graphs appear in the 
cited article) and comment on their shapes. 
b. Obtain the cumulative distribution function of X. 
c. Obtain an expression for the median of the waiting time distribution. 
d. For the case θ = 4, τ = 80, calculate P(50 ≤ X ≤ 70) without at this point doing 
any additional integration. 


17. Let X have a uniform distribution on the interval [A, B]. 
a. Obtain an expression for the (100p)th percentile. 
b. Compute E(X), V(X), and σ_X. 
c. For n, a positive integer, compute E(Xⁿ). 


18. Let X denote the voltage at the output of a microphone, and suppose that X has a 
uniform distribution on the interval from −1 to 1. The voltage is processed by a 
“hard limiter” with cutoff values −.5 and .5, so the limiter output is a random 
variable Y related to X by Y = X if |X| ≤ .5, Y = .5 if X > .5, and Y = −.5 if X < −.5. 
a. What is P(Y = .5)? 
b. Obtain the cumulative distribution function of Y and graph it. 


19. Let X be a continuous rv with cdf 

\[ F(x) = \begin{cases} 0 & x \le 0 \\ \dfrac{x}{4}\left[ 1 + \ln\left( \dfrac{4}{x} \right) \right] & 0 < x \le 4 \\ 1 & x > 4 \end{cases} \]

[This type of cdf is suggested in the article “Variability in Measured 
Bedload-Transport Rates” (Water Resources Bull., 1985: 39-48) as a model for a 
certain hydrologic variable.] What is 
a. P(X ≤ 1)? 
b. P(1 ≤ X ≤ 3)? 
c. The pdf of X? 


20. Consider the pdf for total waiting time Y for two buses 

\[ f(y) = \begin{cases} \dfrac{1}{25}y & 0 \le y < 5 \\ \dfrac{2}{5} - \dfrac{1}{25}y & 5 \le y \le 10 \\ 0 & \text{otherwise} \end{cases} \]

introduced in Exercise 8. 
a. Compute and sketch the cdf of Y. [Hint: Consider separately 0 ≤ y < 5 and 
5 ≤ y ≤ 10 in computing F(y). A graph of the pdf should be helpful.] 
b. Obtain an expression for the (100p)th percentile. [Hint: Consider separately 
0 < p < .5 and .5 < p < 1.] 
c. Compute E(Y) and V(Y). How do these compare with the expected waiting time and 
variance for a single bus when the time is uniformly distributed on [0, 5]? 


21. An ecologist wishes to mark off a circular sampling region having radius 10 m. 
However, the radius of the resulting region is actually a random variable R with pdf 

\[ f(r) = \begin{cases} \dfrac{3}{4}\left[ 1 - (10 - r)^2 \right] & 9 \le r \le 11 \\ 0 & \text{otherwise} \end{cases} \]

What is the expected area of the resulting circular region? 


22. The weekly demand for propane gas (in 1000s of gallons) from a particular facility 
is an rv X with pdf 

\[ f(x) = \begin{cases} 2\left( 1 - \dfrac{1}{x^2} \right) & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases} \]

a. Compute the cdf of X. 
b. Obtain an expression for the (100p)th percentile. What is the value of μ̃? 
c. Compute E(X) and V(X). 
d. If 1.5 thousand gallons are in stock at the beginning of the week and no new supply 
is due in during the week, how much of the 1.5 thousand gallons is expected to be left 
at the end of the week? [Hint: Let h(x) = amount left when demand = x.] 


23. If the temperature at which a certain compound melts is a random variable with mean 
value 120°C and standard deviation 2°C, what are the mean temperature and standard 
deviation measured in °F? [Hint: °F = 1.8°C + 32.] 


24. Let X have the Pareto pdf 

\[ f(x; k, \theta) = \begin{cases} \dfrac{k \cdot \theta^k}{x^{k+1}} & x \ge \theta \\ 0 & x < \theta \end{cases} \]

introduced in Exercise 10. 
a. If k > 1, compute E(X). 
b. What can you say about E(X) if k = 1? 
c. If k > 2, show that V(X) = kθ²(k − 1)⁻²(k − 2)⁻¹. 
d. If k = 2, what can you say about V(X)? 
e. What conditions on k are necessary to ensure that E(Xⁿ) is finite? 


25. Let X be the temperature in °C at which a certain chemical reaction takes place, 
and let Y be the temperature in °F (so Y = 1.8X + 32). 
a. If the median of the X distribution is μ̃, show that 1.8μ̃ + 32 is the median of 
the Y distribution. 
b. How is the 90th percentile of the Y distribution related to the 90th percentile of 
the X distribution? Verify your conjecture. 
c. More generally, if Y = aX + b, how is any particular percentile of the Y 
distribution related to the corresponding percentile of the X distribution? 


26. Let X be the total medical expenses (in 1000s of dollars) incurred by a particular 
individual during a given year. Although X is a discrete random variable, suppose its 
distribution is quite well approximated by a continuous distribution with pdf 
f(x) = k(1 + x/2.5)⁻⁷ for x ≥ 0. 
a. What is the value of k? 
b. Graph the pdf of X. 
c. What are the expected value and standard deviation of total medical expenses? 
d. This individual is covered by an insurance plan that entails a $500 deductible 
provision (so the first $500 worth of expenses are paid by the individual). Then the 
plan will pay 80% of any additional expenses exceeding $500, and the maximum payment by 
the individual (including the deductible amount) is $2500. Let Y denote the amount of 
this individual’s medical expenses paid by the insurance company. What is the expected 
value of Y? [Hint: First figure out what value of X corresponds to the maximum 
out-of-pocket expense of $2500. Then write an expression for Y as a function of X 
(which involves several different pieces) and calculate the expected value of this 
function.] 


27. When a dart is thrown at a circular target, consider the location of the landing 
point relative to the bull’s eye. Let X be the angle in degrees measured from the 
horizontal, and assume that X is uniformly distributed on [0, 360]. Define Y to be the 
transformed variable Y = h(X) = (2π/360)X − π, so Y is the angle measured in radians 
and Y is between −π and π. Obtain E(Y) and σ_Y by first obtaining E(X) and σ_X, and 
then using the fact that h(X) is a linear function of X. 


4.3. The Normal Distribution 


The normal distribution is the most important one in all of probability and statistics. 
Many numerical populations have distributions that can be fit very closely by an 
appropriate normal curve. Examples include heights, weights, and other physical 
characteristics (the famous 1903 Biometrika article “On the Laws of Inheritance in 
Man” discussed many examples of this sort), measurement errors in scientific 
experiments, anthropometric measurements on fossils, reaction times in psychological 
experiments, measurements of intelligence and aptitude, scores on various 
tests, and numerous economic measures and indicators. In addition, even when indi- 
vidual variables themselves are not normally distributed, sums and averages of the 
variables will under suitable conditions have approximately a normal distribution; 
this is the content of the Central Limit Theorem discussed in the next chapter. 


DEFINITION A continuous rv X is said to have a normal distribution with parameters μ 
and σ (or μ and σ²), where −∞ < μ < ∞ and 0 < σ, if the pdf of X is 


\[ f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \mu)^2/(2\sigma^2)} \qquad -\infty < x < \infty \qquad (4.3) \]


Again e denotes the base of the natural logarithm system and equals approximately 
2.71828, and π represents the familiar mathematical constant with approximate 
value 3.14159. The statement that X is normally distributed with parameters μ and 
σ² is often abbreviated X ~ N(μ, σ²). 

Clearly f(x; μ, σ) ≥ 0, but a somewhat complicated calculus argument must 
be used to verify that ∫ f(x; μ, σ) dx = 1 when the integral is taken from −∞ to ∞. 
It can be shown that E(X) = μ and V(X) = σ², so the parameters are the mean and the 
standard deviation of X. Figure 4.13 presents graphs of f(x; μ, σ) for several 
different (μ, σ) pairs. Each density curve is symmetric about μ and bell-shaped, so 
the center of the bell (point of symmetry) is both the mean of the distribution and 
the median. The mean μ is a location parameter, since changing its value rigidly 
shifts the density curve to one side or the other; σ is referred to as a scale 
parameter, because changing its value stretches or compresses the curve horizontally 
without changing the basic shape. The inflection points of a normal curve (points at 
which the curve changes from turning downward to turning upward) occur at μ − σ and 
μ + σ. Thus the value of σ can be visualized as the distance from the mean to these 
inflection points. A large value of σ corresponds to a density curve that is quite 
spread out about μ, whereas a small value yields a highly concentrated curve. The 
larger the value of σ, the more likely it is that a value of X far from the mean may 
be observed. 


Figure 4.13 (a) Two different normal density curves (b) Visualizing μ and σ for a normal 
distribution 


The Standard Normal Distribution 


The computation of P(a ≤ X ≤ b) when X is a normal rv with parameters μ and σ 
requires evaluating 


\[ \int_{a}^{b} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \mu)^2/(2\sigma^2)}\,dx \qquad (4.4) \]

None of the standard integration techniques can be used to accomplish this. Instead, 
for μ = 0 and σ = 1, Expression (4.4) has been calculated using numerical techniques 
and tabulated for certain values of a and b. This table can also be used 
to compute probabilities for any other values of μ and σ under consideration. 


DEFINITION The normal distribution with parameter values μ = 0 and σ = 1 is called the 
standard normal distribution. A random variable having a standard normal 
distribution is called a standard normal random variable and will be 
denoted by Z. The pdf of Z is 


\[ f(z; 0, 1) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \qquad -\infty < z < \infty \]

The graph of f(z; 0, 1) is called the standard normal (or z) curve. Its inflection 
points are at 1 and −1. The cdf of Z is P(Z ≤ z), the integral of f(y; 0, 1) with 
respect to y from −∞ to z, which we will denote by Φ(z). 


The standard normal distribution almost never serves as a model for a naturally 
arising population. Instead, it is a reference distribution from which information 
about other normal distributions can be obtained. Appendix Table A.3 gives 
Φ(z) = P(Z ≤ z), the area under the standard normal density curve to the left of z, 
for z = —3.49, —3.48,..., 3.48, 3.49. Figure 4.14 illustrates the type of cumulative 
area (probability) tabulated in Table A.3. From this table, various other probabilities 
involving Z can be calculated. 


Figure 4.14 Standard normal cumulative areas tabulated in Appendix Table A.3 [the shaded area under the standard normal (z) curve to the left of z equals Φ(z)]


EXAMPLE 4.13 Let’s determine the following standard normal probabilities: (a) P(Z ≤ 1.25), 
(b) P(Z > 1.25), (c) P(Z ≤ −1.25), (d) P(−.38 ≤ Z ≤ 1.25), and (e) P(Z ≤ 5). 


a. P(Z ≤ 1.25) = Φ(1.25), a probability that is tabulated in Appendix Table A.3 at 
the intersection of the row marked 1.2 and the column marked .05. The number 
there is .8944, so P(Z ≤ 1.25) = .8944. Figure 4.15(a) illustrates this probability. 


b. P(Z > 1.25) = 1 − P(Z ≤ 1.25) = 1 − Φ(1.25), the area under the z curve 
to the right of 1.25 (an upper-tail area). Then Φ(1.25) = .8944 implies that 
P(Z > 1.25) = .1056. Since Z is a continuous rv, P(Z ≥ 1.25) = .1056. See 
Figure 4.15(b). 
Figure 4.15 Normal curve areas (probabilities) for Example 4.13 [panel (a): shaded area = Φ(1.25) under the z curve; panel (b): upper-tail area to the right of 1.25]


c. P(Z ≤ −1.25) = Φ(−1.25), a lower-tail area. Directly from Appendix Table 
A.3, Φ(−1.25) = .1056. By symmetry of the z curve, this is the same answer as 
in part (b). 

d. P(−.38 ≤ Z ≤ 1.25) is the area under the standard normal curve above the interval 
whose left endpoint is −.38 and whose right endpoint is 1.25. From Section 
4.2, if X is a continuous rv with cdf F(x), then P(a ≤ X ≤ b) = F(b) − F(a). 
Thus P(−.38 ≤ Z ≤ 1.25) = Φ(1.25) − Φ(−.38) = .8944 − .3520 = .5424. 
(See Figure 4.16.) 


Figure 4.16 P(−.38 ≤ Z ≤ 1.25) as the difference between two cumulative areas 


e. P(Z ≤ 5) = Φ(5), the cumulative area under the z curve to the left of 5. This 
probability does not appear in the table because the last row is labeled 3.4. 
However, the last entry in that row is Φ(3.49) = .9998. That is, essentially all of 
the area under the curve lies to the left of 3.49 (at most 3.49 standard deviations 
to the right of the mean). Therefore we conclude that P(Z ≤ 5) ≈ 1. 
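
When Table A.3 is not at hand, Φ(z) can be evaluated in software. The sketch below uses the standard identity Φ(z) = (1/2)[1 + erf(z/√2)] together with Python's `math.erf`; this is a substitute for the table lookup rather than the method used in the text, and the name `phi` is an illustrative choice.

```python
import math


def phi(z: float) -> float:
    """Standard normal cdf via the error function: 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


part_a = phi(1.25)                 # P(Z <= 1.25)
part_b = 1.0 - phi(1.25)           # P(Z > 1.25)
part_c = phi(-1.25)                # P(Z <= -1.25)
part_d = phi(1.25) - phi(-0.38)    # P(-.38 <= Z <= 1.25)
part_e = phi(5.0)                  # P(Z <= 5)

for label, value in zip("abcde", (part_a, part_b, part_c, part_d, part_e)):
    print(label, round(value, 4))
```

The results should match the tabled answers of Example 4.13 to about four decimal places; small differences in the last digit come from the rounding built into the table.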


Percentiles of the Standard 
Normal Distribution 


For any p between 0 and 1, Appendix Table A.3 can be used to obtain the (100p)th 
percentile of the standard normal distribution. 


EXAMPLE 4.14 The 99th percentile of the standard normal distribution is that value on the horizon- 
tal axis such that the area under the z curve to the left of the value is .9900. Appendix 
Table A.3 gives for fixed z the area under the standard normal curve to the left of z, 
whereas here we have the area and want the value of z. This is the “inverse” problem 
to P(Z ≤ z) = ? so the table is used in an inverse fashion: Find in the middle of 
the table .9900; the row and column in which it lies identify the 99th z percentile. 
Here .9901 lies at the intersection of the row marked 2.3 and column marked .03, 
so the 99th percentile is (approximately) z = 2.33. (See Figure 4.17.) By symmetry, 
the first percentile is as far below 0 as the 99th is above 0, so equals —2.33 (1% lies 
below the first and also above the 99th). (See Figure 4.18.) 
Figure 4.17 Finding the 99th percentile [shaded area = .9900 under the z curve to the left of the 99th percentile] 

Figure 4.18 The relationship between the 1st and 99th percentiles [shaded area = .01 in each tail; −2.33 = 1st percentile, 2.33 = 99th percentile] 


In general, the (100p)th percentile is identified by the row and column of Appendix 
Table A.3 in which the entry p is found (e.g., the 67th percentile is obtained by finding 
.6700 in the body of the table, which gives z = .44). If p does not appear, the number 
closest to it is typically used, although linear interpolation gives a more accurate 
answer. For example, to find the 95th percentile, look for .9500 inside the table. 
Although it does not appear, both .9495 and .9505 do, corresponding to z = 1.64 
and 1.65, respectively. Since .9500 is halfway between the two probabilities that do 
appear, we will use 1.645 as the 95th percentile and −1.645 as the 5th percentile. 
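
The linear interpolation mentioned above amounts to one line of arithmetic. The sketch below applies it to the two table entries quoted for the 95th percentile; the function name and argument order are assumptions made for illustration.

```python
def interpolate_z(p: float, z_lo: float, p_lo: float, z_hi: float, p_hi: float) -> float:
    """Linearly interpolate the z value whose cumulative area is p,
    given two neighboring table entries (z_lo, p_lo) and (z_hi, p_hi)."""
    return z_lo + (p - p_lo) * (z_hi - z_lo) / (p_hi - p_lo)


# Table A.3 entries quoted in the text: Phi(1.64) = .9495 and Phi(1.65) = .9505
z_95 = interpolate_z(0.9500, 1.64, 0.9495, 1.65, 0.9505)
z_05 = -z_95  # by symmetry of the z curve

print(round(z_95, 3), round(z_05, 3))
```

Because .9500 sits exactly halfway between the two tabled areas, the interpolated value is the midpoint of 1.64 and 1.65.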


z_α Notation for z Critical Values 


In statistical inference, we will need the values on the horizontal z axis that capture 
certain small tail areas under the standard normal curve. 


Notation 


z_α will denote the value on the z axis for which α of the area under the z curve 
lies to the right of z_α. (See Figure 4.19.) 


For example, z_.10 captures upper-tail area .10, and z_.01 captures upper-tail area .01. 


Figure 4.19 z_α notation illustrated [shaded area under the z curve to the right of z_α = P(Z ≥ z_α) = α]


Since α of the area under the z curve lies to the right of z_α, 1 − α of the area 
lies to its left. Thus z_α is the 100(1 − α)th percentile of the standard normal 
distribution. By symmetry the area under the standard normal curve to the left of −z_α is 
also α. The z_α’s are usually referred to as z critical values. Table 4.1 lists the most 
useful z percentiles and z_α values. 
Table 4.1 Standard Normal Percentiles and Critical Values 

Percentile                      90     95      97.5    99     99.5    99.9    99.95 
α (upper-tail area)             .1     .05     .025    .01    .005    .001    .0005 
z_α = 100(1 − α)th percentile   1.28   1.645   1.96    2.33   2.58    3.08    3.27 


EXAMPLE 4.15 z_.05 is the 100(1 − .05)th = 95th percentile of the standard normal distribution, so 
z_.05 = 1.645. The area under the standard normal curve to the left of −z_.05 is also 
.05. (See Figure 4.20.) 


Figure 4.20 Finding z_.05 [shaded area = .05 in each tail of the z curve; −1.645 = −z_.05 and z_.05 = 95th percentile = 1.645] 
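
Critical values can also be computed rather than looked up. The sketch below assumes Python 3.8 or later, where the standard library class `statistics.NormalDist` provides `cdf` and `inv_cdf`; the helper name `z_critical` is illustrative and not notation from the text.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_critical(alpha: float) -> float:
    """Return z_alpha, the 100(1 - alpha)th percentile of the standard normal."""
    return STANDARD_NORMAL.inv_cdf(1.0 - alpha)


z_05 = z_critical(0.05)
lower_tail = STANDARD_NORMAL.cdf(-z_05)   # area to the left of -z_.05

for alpha in (0.10, 0.05, 0.025, 0.01, 0.005):
    print(alpha, round(z_critical(alpha), 3))
```

Computed values can differ slightly in the last digit from entries obtained by locating the nearest area in a printed table.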


Nonstandard Normal Distributions 


When X ~ N(μ, σ²), probabilities involving X are computed by “standardizing.” The 
standardized variable is (X − μ)/σ. Subtracting μ shifts the mean from μ to zero, and 
then dividing by σ scales the variable so that the standard deviation is 1 rather than σ. 


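PROPOSITION If X has a normal distribution with mean μ and standard deviation σ, then 


\[ Z = \frac{X - \mu}{\sigma} \]


has a standard normal distribution. Thus 


\[ P(a \le X \le b) = P\left( \frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma} \right) = \Phi\left( \frac{b - \mu}{\sigma} \right) - \Phi\left( \frac{a - \mu}{\sigma} \right) \]


\[ P(X \le a) = \Phi\left( \frac{a - \mu}{\sigma} \right) \qquad P(X \ge b) = 1 - \Phi\left( \frac{b - \mu}{\sigma} \right) \]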


According to the first part of the proposition, the area under the normal (μ, σ²) curve 
that lies above the interval [a, b] is identical to the area under the standard normal curve 
that lies above the interval from the standardized lower limit (a − μ)/σ to the standardized 
upper limit (b − μ)/σ. An illustration of the second part appears in Figure 4.21. 
The key idea is that by standardizing, any probability involving X can be expressed as 
a probability involving a standard normal rv Z, so that Appendix Table A.3 can be used. 
The proposition can be proved by writing the cdf of Z = (X − μ)/σ as 


\[ P(Z \le z) = P(X \le \sigma z + \mu) = \int_{-\infty}^{\sigma z + \mu} f(x; \mu, \sigma)\,dx \]
Figure 4.21 Equality of nonstandard and standard normal curve areas [left panel: N(μ, σ²) curve with area to the left of a; right panel: N(0, 1) curve with area to the left of (a − μ)/σ] 


Using a result from calculus, this integral can be differentiated with respect to z to 
yield the desired pdf f(z; 0, 1). 


EXAMPLE 4.16 The time that it takes a driver to react to the brake lights on a decelerating vehicle 
is critical in helping to avoid rear-end collisions. The article “Fast-Rise Brake 
Lamp as a Collision-Prevention Device” (Ergonomics, 1993: 391-395) suggests 
that reaction time for an in-traffic response to a brake signal from standard 
brake lights can be modeled with a normal distribution having mean value 
1.25 sec and standard deviation of .46 sec. What is the probability that reaction 
time is between 1.00 sec and 1.75 sec? If we let X denote reaction time, then 
standardizing gives 


1.00 ≤ X ≤ 1.75 

if and only if 


\[ \frac{1.00 - 1.25}{.46} \le \frac{X - 1.25}{.46} \le \frac{1.75 - 1.25}{.46} \]


Thus 


\[ P(1.00 \le X \le 1.75) = P\left( \frac{1.00 - 1.25}{.46} \le Z \le \frac{1.75 - 1.25}{.46} \right) = P(-.54 \le Z \le 1.09) = \Phi(1.09) - \Phi(-.54) = .8621 - .2946 = .5675 \]


This is illustrated in Figure 4.22. Similarly, if we view 2 sec as a critically long reaction 
time, the probability that actual reaction time will exceed this value is 


\[ P(X > 2) = P\left( Z > \frac{2 - 1.25}{.46} \right) = P(Z > 1.63) = 1 - \Phi(1.63) = .0516 \]

Figure 4.22 Normal curves for Example 4.16 [the area P(1.00 ≤ X ≤ 1.75) under the normal curve with μ = 1.25, σ = .46 and the corresponding area under the z curve] 
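
The standardization in Example 4.16 can be mirrored in a few lines of code. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; it standardizes without rounding z to two decimals, so its answers differ slightly from the table-based values .5675 and .0516. The function and variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 1.25, 0.46          # reaction-time model from Example 4.16 (sec)
Z = NormalDist()                # standard normal reference distribution


def prob_between(a: float, b: float) -> float:
    """P(a <= X <= b) computed by standardizing both limits."""
    return Z.cdf((b - mu) / sigma) - Z.cdf((a - mu) / sigma)


p_between = prob_between(1.00, 1.75)
p_exceeds_2 = 1.0 - Z.cdf((2.0 - mu) / sigma)

# Same probability computed directly on the N(mu, sigma^2) scale, with no standardizing
X = NormalDist(mu, sigma)
p_direct = X.cdf(1.75) - X.cdf(1.00)

print(round(p_between, 4), round(p_exceeds_2, 4), round(p_direct, 4))
```

Agreement between `p_between` and `p_direct` is a numerical illustration of the proposition: the area above [a, b] under the N(μ, σ²) curve equals the area above the standardized interval under the z curve.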
Standardizing amounts to nothing more than calculating a distance from the mean 
value and then reexpressing the distance as some number of standard deviations. 
Thus, if μ = 100 and σ = 15, then x = 130 corresponds to z = (130 − 100)/15 = 
30/15 = 2.00. That is, 130 is 2 standard deviations above (to the right of) the mean 
value. Similarly, standardizing 85 gives (85 − 100)/15 = −1.00, so 85 is 1 standard 
deviation below the mean. The z table applies to any normal distribution provided that 
we think in terms of number of standard deviations away from the mean value. 


EXAMPLE 4.17 The breakdown voltage of a randomly chosen diode of a particular type is known to 
be normally distributed. What is the probability that a diode’s breakdown voltage is 
within 1 standard deviation of its mean value? This question can be answered without 
knowing either μ or σ, as long as the distribution is known to be normal; the 
answer is the same for any normal distribution: 


\[ P(X \text{ is within 1 standard deviation of its mean}) = P(\mu - \sigma \le X \le \mu + \sigma) = P\left( \frac{\mu - \sigma - \mu}{\sigma} \le Z \le \frac{\mu + \sigma - \mu}{\sigma} \right) = P(-1.00 \le Z \le 1.00) = \Phi(1.00) - \Phi(-1.00) = .6826 \]


The probability that X is within 2 standard deviations of its mean is 
P(−2.00 ≤ Z ≤ 2.00) = .9544 and within 3 standard deviations of the mean is 
P(−3.00 ≤ Z ≤ 3.00) = .9974. 


The results of Example 4.17 are often reported in percentage form and referred 
to as the empirical rule (because empirical evidence has shown that histograms of 
real data can very frequently be approximated by normal curves). 


If the population distribution of a variable is (approximately) normal, then 
1. Roughly 68% of the values are within 1 SD of the mean. 

2. Roughly 95% of the values are within 2 SDs of the mean. 

3. Roughly 99.7% of the values are within 3 SDs of the mean. 


It is indeed unusual to observe a value from a normal population that is much farther 
than 2 standard deviations from μ. These results will be important in the development 
of hypothesis-testing procedures in later chapters. 
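
As an illustrative aside, the three empirical-rule probabilities can be reproduced with software rather than the z table. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is hypothetical. Because the table rounds Φ(z) to four decimals, the software values can differ from the tabled ones in the last digit.

```python
from statistics import NormalDist

def prob_within(k: float) -> float:
    """Return P(-k <= Z <= k) for a standard normal rv Z."""
    z = NormalDist()  # mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

# Probability of falling within 1, 2, and 3 SDs of the mean
for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
```

The same probabilities apply to any normal distribution, since standardizing removes μ and σ.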


Percentiles of an Arbitrary 
Normal Distribution 


The (100p)th percentile of a normal distribution with mean μ and standard deviation 
σ is easily related to the (100p)th percentile of the standard normal distribution. 


PROPOSITION 

\[
(100p)\text{th percentile for normal } (\mu, \sigma) = \mu + \left[(100p)\text{th percentile for standard normal}\right] \cdot \sigma
\]

Another way of saying this is that if z is the desired percentile for the standard 
normal distribution, then the desired percentile for the normal (μ, σ) distribution is z 
standard deviations from μ. 


EXAMPLE 4.18 The authors of “Assessment of Lifetime of Railway Axle” (Intl. J. of Fatigue, 
2013: 40-46) used data collected from an experiment with a specified initial crack 
length and number of loading cycles to propose a normal distribution with mean 
value 5.496 mm and standard deviation .067 mm for the rv X = final crack depth. 
For this model, what value of final crack depth would be exceeded by only .5% of all 
cracks under these circumstances? Let c denote the requested value. Then the desired 
condition is that P(X > c) = .005, or, equivalently, that P(X ≤ c) = .995. Thus c is 
the 99.5th percentile of the normal distribution with μ = 5.496 and σ = .067. The 
99.5th percentile of the standard normal distribution is 2.58, so 

\[
c = \eta(.995) = 5.496 + (2.58)(.067) = 5.496 + .173 = 5.669 \text{ mm}
\]


This is illustrated in Figure 4.23. 


Figure 4.23 Distribution of final crack depth for Example 4.18 (shaded area = .995 to the 
left of c = 99.5th percentile = 5.669; μ = 5.496) ■ 
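
The percentile relationship used in Example 4.18 can be sketched in Python as follows. This is only an illustration: it assumes the standard-library `statistics.NormalDist`, and the function name `normal_percentile` is hypothetical. Software uses an unrounded z percentile (about 2.576 rather than the tabled 2.58), so agreement with the hand calculation is to about three decimals.

```python
from statistics import NormalDist

def normal_percentile(p: float, mu: float, sigma: float) -> float:
    """(100p)th percentile of a normal(mu, sigma) distribution: mu + z_p * sigma."""
    z_p = NormalDist().inv_cdf(p)  # (100p)th percentile of the standard normal
    return mu + z_p * sigma

# Final crack depth exceeded by only .5% of cracks (values from Example 4.18)
c = normal_percentile(0.995, mu=5.496, sigma=0.067)
print(round(c, 3))
```

Calling `NormalDist(5.496, 0.067).inv_cdf(0.995)` directly gives the same value; the two-step version is shown only because it mirrors the proposition.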


The Normal Distribution and 
Discrete Populations 


The normal distribution is often used as an approximation to the distribution of val- 
ues in a discrete population. In such situations, extra care should be taken to ensure 
that probabilities are computed in an accurate manner. 


EXAMPLE 4.19 IQ in a particular population (as measured by a standard test) is known to be 
approximately normally distributed with μ = 100 and σ = 15. What is the probability 
that a randomly selected individual has an IQ of at least 125? Letting 
X = the IQ of a randomly chosen person, we wish P(X ≥ 125). The temptation 
here is to standardize X ≥ 125 as in previous examples. However, the IQ population 
distribution is actually discrete, since IQs are integer-valued. So the normal 
curve is an approximation to a discrete probability histogram, as pictured in 
Figure 4.24. 

The rectangles of the histogram are centered at integers. IQs of at least 125 
correspond to rectangles beginning at 124.5, as shaded in Figure 4.24. Thus we 
really want the area under the approximating normal curve to the right of 124.5. 
Standardizing this value gives P(Z ≥ 1.63) = .0516, whereas standardizing 125 
results in P(Z ≥ 1.67) = .0475. The difference is not great, but the answer .0516 is 
more accurate. Similarly, P(X = 125) would be approximated by the area between 
124.5 and 125.5, since the area under the normal curve above the single value 125 
is zero. 

Figure 4.24 A normal approximation to a discrete distribution ■ 


The correction for discreteness of the underlying distribution in Exam- 
ple 4.19—that is, the addition or subtraction of .5 before standardizing—is often 
called a continuity correction. It is useful in the following application of the normal 
distribution to the computation of binomial probabilities. 
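
A minimal sketch of the continuity correction for the IQ setting of Example 4.19 (μ = 100, σ = 15) is given below, assuming Python's standard-library `statistics.NormalDist`. The variable names are illustrative only. The printed values use unrounded z scores, so they differ slightly from the table-based .0475 and .0516.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # approximating normal curve

# P(X >= 125) for an integer-valued X
without_correction = 1 - iq.cdf(125)    # area to the right of 125
with_correction = 1 - iq.cdf(124.5)     # area to the right of 124.5

# P(X = 125): area between 124.5 and 125.5 (the area above the single point 125 is zero)
exactly_125 = iq.cdf(125.5) - iq.cdf(124.5)

print(f"{without_correction:.4f} {with_correction:.4f} {exactly_125:.4f}")
```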


Approximating the Binomial Distribution 


Recall that the mean value and standard deviation of a binomial random variable 
X are μ_X = np and σ_X = √(npq), respectively. Figure 4.25 displays a binomial 
probability histogram for the binomial distribution with n = 25, p = .6, for 
which μ = 25(.6) = 15 and σ = √(25(.6)(.4)) = 2.449. A normal curve with this 
μ and σ has been superimposed on the probability histogram. Although the probability 
histogram is a bit skewed (because p ≠ .5), the normal curve gives a very 
good approximation, especially in the middle part of the picture. The area of any 
rectangle (probability of any particular X value) except those in the extreme tails can 
be accurately approximated by the corresponding normal curve area. For example, 
P(X = 10) = B(10; 25, .6) − B(9; 25, .6) = .021, whereas the area under the normal 
curve between 9.5 and 10.5 is P(−2.25 ≤ Z ≤ −1.84) = .0207. 


Figure 4.25 Binomial probability histogram for n = 25, p = .6 with normal approximation 
curve (mean 15, standard deviation 2.449) superimposed 
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More generally, as long as the binomial probability histogram is not too 
skewed, binomial probabilities can be well approximated by normal curve areas. It 
is then customary to say that X has approximately a normal distribution. 


PROPOSITION Let X be a binomial rv based on n trials with success probability p. Then if 
the binomial probability histogram is not too skewed, X has approximately a 
normal distribution with μ = np and σ = √(npq). In particular, for x = a possible 
value of X, 

\[
P(X \le x) = B(x; n, p) \approx \left(\text{area under the normal curve to the left of } x + .5\right)
= \Phi\left(\frac{x + .5 - np}{\sqrt{npq}}\right)
\]

In practice, the approximation is adequate provided that both np ≥ 10 and 
nq ≥ 10 (i.e., the expected number of successes and the expected number 
of failures are both at least 10), since there is then enough symmetry in the 
underlying binomial distribution. 


A direct proof of the approximation’s validity is quite difficult. In the next chapter 
we'll see that it is a consequence of a more general result called the Central Limit 
Theorem. In all honesty, the approximation is not so important for probability cal- 
culation as it once was. This is because software can now calculate binomial prob- 
abilities exactly for quite large values of n. 
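
To illustrate the comparison between exact and approximate binomial probabilities, here is a small Python sketch using only the standard library. The function names `binom_cdf` and `normal_approx_cdf` are hypothetical helpers, not part of any package. It revisits the case n = 25, p = .6 and the probability P(X = 10) discussed with Figure 4.25.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(x: int, n: int, p: float) -> float:
    """Exact B(x; n, p) = P(X <= x) for a binomial rv."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def normal_approx_cdf(x: int, n: int, p: float) -> float:
    """Normal approximation with continuity correction: Phi((x + .5 - np)/sqrt(npq))."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(x + 0.5)

n, p = 25, 0.6
exact = binom_cdf(10, n, p) - binom_cdf(9, n, p)                    # P(X = 10), exact
approx = normal_approx_cdf(10, n, p) - normal_approx_cdf(9, n, p)   # area from 9.5 to 10.5
print(f"exact {exact:.4f}, approx {approx:.4f}")
```

Summing `comb(n, k)` terms is adequate for moderate n; a dedicated statistics library would be the usual choice for very large n.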


EXAMPLE 4.20 Suppose that 25% of all students at a large public university receive financial aid. Let 
X be the number of students in a random sample of size 50 who receive financial 
aid, so that p = .25. Then μ = 12.5 and σ = 3.06. Since np = 50(.25) = 12.5 ≥ 10 
and nq = 37.5 ≥ 10, the approximation can safely be applied. The probability that 
at most 10 students receive aid is 

\[
P(X \le 10) = B(10; 50, .25) \approx \Phi\left(\frac{10 + .5 - 12.5}{3.06}\right) = \Phi(-.65) = .2578
\]

Similarly, the probability that between 5 and 15 (inclusive) of the selected students 
receive aid is 

\[
P(5 \le X \le 15) = B(15; 50, .25) - B(4; 50, .25)
\approx \Phi\left(\frac{15.5 - 12.5}{3.06}\right) - \Phi\left(\frac{4.5 - 12.5}{3.06}\right) = .8320
\]

The exact probabilities are .2622 and .8348, respectively, so the approximations are 
quite good. In the last calculation, P(5 ≤ X ≤ 15) is being approximated by the area 
under the normal curve between 4.5 and 15.5; the continuity correction is used for 
both the upper and lower limits. ■ 


When the objective of our investigation is to make an inference about a population 
proportion p, interest will focus on the sample proportion of successes X/n rather 
than on X itself. Because this proportion is just X multiplied by the constant 1/n, it will 
also have approximately a normal distribution (with mean μ = p and standard deviation 
σ = √(pq/n)) provided that both np ≥ 10 and nq ≥ 10. This normal approximation 
is the basis for several inferential procedures to be discussed in later chapters. 






EXERCISES Section 4.3 (28-58) 


28. Let Z be a standard normal random variable and calculate the following probabilities, 
drawing pictures wherever appropriate. 
a. P(0 ≤ Z ≤ 2.17) 
b. P(0 ≤ Z ≤ 1) 
c. P(−2.50 ≤ Z ≤ 0) 
d. P(−2.50 ≤ Z ≤ 2.50) 
e. P(Z ≤ 1.37) 
f. P(−1.75 ≤ Z) 
g. P(−1.50 ≤ Z ≤ 2.00) 
h. P(1.37 ≤ Z ≤ 2.50) 
i. P(1.50 ≤ Z) 
j. P(|Z| ≤ 2.50) 

29. In each case, determine the value of the constant c that makes the probability 
statement correct. 
a. Φ(c) = .9838 
b. P(0 ≤ Z ≤ c) = .291 
c. P(c ≤ Z) = .121 
d. P(−c ≤ Z ≤ c) = .668 
e. P(c ≤ |Z|) = .016 

30. Find the following percentiles for the standard normal distribution. Interpolate 
where appropriate. 
a. 91st 
b. 9th 
c. 75th 
d. 25th 
e. 6th 

31. Determine z_α for the following values of α: 
a. α = .0055 
b. α = .09 
c. α = .663 

32. Suppose the force acting on a column that helps to support a building is a normally 
distributed random variable X with mean value 15.0 kips and standard deviation 
1.25 kips. Compute the following probabilities by standardizing and then using Table A.3. 
a. P(X ≤ 15) 
b. P(X ≤ 17.5) 
c. P(X ≥ 10) 
d. P(14 ≤ X ≤ 18) 
e. P(|X − 15| ≤ 3) 

33. Mopeds (small motorcycles with an engine capacity below 50 cm³) are very popular in 
Europe because of their mobility, ease of operation, and low cost. The article 
“Procedure to Verify the Maximum Speed of Automatic Transmission Mopeds in Periodic 
Motor Vehicle Inspections” (J. of Automobile Engr., 2008: 1615-1623) described a 
rolling bench test for determining maximum vehicle speed. A normal distribution with 
mean value 46.8 km/h and standard deviation 1.75 km/h is postulated. Consider randomly 
selecting a single such moped. 
a. What is the probability that maximum speed is at most 50 km/h? 
b. What is the probability that maximum speed is at least 48 km/h? 
c. What is the probability that maximum speed differs from the mean value by at most 
1.5 standard deviations? 

34. The article “Reliability of Domestic-Waste Biofilm Reactors” (J. of Envir. Engr., 
1995: 785-790) suggests that substrate concentration (mg/cm³) of influent to a 
reactor is normally distributed with μ = .30 and σ = .06. 
a. What is the probability that the concentration exceeds .50? 
b. What is the probability that the concentration is at most .20? 
c. How would you characterize the largest 5% of all concentration values? 

35. In a road-paving process, asphalt mix is delivered to the hopper of the paver by 
trucks that haul the material from the batching plant. The article “Modeling of 
Simultaneously Continuous and Stochastic Construction Activities for Simulation” 
(J. of Construction Engr. and Mgmnt., 2013: 1037-1045) proposed a normal distribution 
with mean value 8.46 min and standard deviation .913 min for the rv X = truck haul time. 
a. What is the probability that haul time will be at least 10 min? Will exceed 10 min? 
b. What is the probability that haul time will exceed 15 min? 
c. What is the probability that haul time will be between 8 and 10 min? 
d. What value c is such that 98% of all haul times are in the interval from 8.46 − c 
to 8.46 + c? 
e. If four haul times are independently selected, what is the probability that at least 
one of them exceeds 10 min? 

36. Spray drift is a constant concern for pesticide applicators and agricultural 
producers. The inverse relationship between droplet size and drift potential is well 
known. The paper “Effects of 2,4-D Formulation and Quinclorac on Spray Droplet Size 
and Deposition” (Weed Technology, 2005: 1030-1036) investigated the effects of 
herbicide formulation on spray atomization. A figure in the paper suggested the normal 
distribution with mean 1050 μm and standard deviation 150 μm was a reasonable model for 
droplet size for water (the “control treatment”) sprayed through a 760 ml/min nozzle. 
a. What is the probability that the size of a single droplet is less than 1500 μm? 
At least 1000 μm? 
b. What is the probability that the size of a single droplet is between 1000 and 1500 μm? 
c. How would you characterize the smallest 2% of all droplets? 
d. If the sizes of five independently selected droplets are measured, what is the 
probability that exactly two of them exceed 1500 μm? 

37. Suppose that blood chloride concentration (mmol/L) has a normal distribution with 
mean 104 and standard deviation 5 (information in the article “Mathematical Model of 
Chloride Concentration in Human Blood,” J. of Med. Engr. and Tech., 2006: 25-30, 
including a normal probability plot as described in Section 4.6, supports this 
assumption). 
a. What is the probability that chloride concentration equals 105? Is less than 105? 
Is at most 105? 
b. What is the probability that chloride concentration differs from the mean by more 
than 1 standard deviation? Does this probability depend on the values of μ and σ? 
c. How would you characterize the most extreme .1% of chloride concentration values? 

38. There are two machines available for cutting corks intended for use in wine bottles. 
The first produces corks with diameters that are normally distributed with mean 3 cm and 
standard deviation .1 cm. The second machine produces corks with diameters that have a 
normal distribution with mean 3.04 cm and standard deviation .02 cm. Acceptable corks 
have diameters between 2.9 cm and 3.1 cm. Which machine is more likely to produce an 
acceptable cork? 

39. The defect length of a corrosion defect in a pressurized steel pipe is normally 
distributed with mean value 30 mm and standard deviation 7.8 mm [suggested in the 
article “Reliability Evaluation of Corroding Pipelines Considering Multiple Failure 
Modes and Time-Dependent Internal Pressure” (J. of Infrastructure Systems, 
2011: 216-224)]. 
a. What is the probability that defect length is at most 20 mm? Less than 20 mm? 
b. What is the 75th percentile of the defect length distribution, that is, the value 
that separates the smallest 75% of all lengths from the largest 25%? 
c. What is the 15th percentile of the defect length distribution? 
d. What values separate the middle 80% of the defect length distribution from the 
smallest 10% and the largest 10%? 

40. The article “Monte Carlo Simulation—Tool for Better Understanding of LRFD” 
(J. of Structural Engr., 1993: 1586-1599) suggests that yield strength (ksi) for 
A36 grade steel is normally distributed with μ = 43 and σ = 4.5. 
a. What is the probability that yield strength is at most 40? Greater than 60? 
b. What yield strength value separates the strongest 75% from the others? 

41. The automatic opening device of a military cargo parachute has been designed to open 
when the parachute is 200 m above the ground. Suppose opening altitude actually has a 
normal distribution with mean value 200 m and standard deviation 30 m. Equipment damage 
will occur if the parachute opens at an altitude of less than 100 m. What is the 
probability that there is equipment damage to the payload of at least one of five 
independently dropped parachutes? 

42. The temperature reading from a thermocouple placed in a constant-temperature medium 
is normally distributed with mean μ, the actual temperature of the medium, and standard 
deviation σ. What would the value of σ have to be to ensure that 95% of all readings are 
within .1° of μ? 

43. Vehicle speed on a particular bridge in China can be modeled as normally distributed 
(“Fatigue Reliability Assessment for Long-Span Bridges under Combined Dynamic Loads 
from Winds and Vehicles,” J. of Bridge Engr., 2013: 735-747). 
a. If 5% of all vehicles travel less than 39.12 m/h and 10% travel more than 73.24 m/h, 
what are the mean and standard deviation of vehicle speed? [Note: The resulting values 
should agree with those given in the cited article.] 
b. What is the probability that a randomly selected vehicle’s speed is between 50 and 
65 m/h? 
c. What is the probability that a randomly selected vehicle’s speed exceeds the speed 
limit of 70 m/h? 

44. If bolt thread length is normally distributed, what is the probability that the 
thread length of a randomly selected bolt is 
a. Within 1.5 SDs of its mean value? 
b. Farther than 2.5 SDs from its mean value? 
c. Between 1 and 2 SDs from its mean value? 

45. A machine that produces ball bearings has initially been set so that the true 
average diameter of the bearings it produces is .500 in. A bearing is acceptable if 
its diameter is within .004 in. of this target value. Suppose, however, that the 
setting has changed during the course of production, so that the bearings have 
normally distributed diameters with mean value .499 in. and standard deviation .002 in. 
What percentage of the bearings produced will not be acceptable? 

46. The Rockwell hardness of a metal is determined by impressing a hardened point into 
the surface of the metal and then measuring the depth of penetration of the point. 
Suppose the Rockwell hardness of a particular alloy is normally distributed with mean 70 
and standard deviation 3. 
a. If a specimen is acceptable only if its hardness is between 67 and 75, what is the 
probability that a randomly chosen specimen has an acceptable hardness? 
b. If the acceptable range of hardness is (70 − c, 70 + c), for what value of c would 
95% of all specimens have acceptable hardness? 
c. If the acceptable range is as in part (a) and the hardness of each of ten randomly 
selected specimens is independently determined, what is the expected number of 
acceptable specimens among the ten? 
d. What is the probability that at most eight of ten independently selected specimens 
have a hardness of less than 73.84? [Hint: Y = the number among the ten specimens with 
hardness less than 73.84 is a binomial variable; what is p?] 

47. The weight distribution of parcels sent in a certain manner is normal with mean 
value 12 lb and standard deviation 3.5 lb. The parcel service wishes to establish a 
weight value c beyond which there will be a surcharge. What value of c is such that 99% 
of all parcels are at least 1 lb under the surcharge weight? 

48. Suppose Appendix Table A.3 contained Φ(z) only for z ≥ 0. Explain how you could 
still compute 
a. P(−1.72 ≤ Z ≤ −.55) 
b. P(−1.72 ≤ Z ≤ .55) 
Is it necessary to tabulate Φ(z) for z negative? What property of the standard normal 
curve justifies your answer? 

49. Consider babies born in the “normal” range of 37-43 weeks gestational age. 
Extensive data supports the assumption that for such babies born in the United States, 
birth weight is normally distributed with mean 3432 g and standard deviation 482 g. 
[The article “Are Babies Normal?” (The American Statistician, 1999: 298-302) analyzed 
data from a particular year; for a sensible choice of class intervals, a histogram did 
not look at all normal, but after further investigations it was determined that this 
was due to some hospitals measuring weight in grams and others measuring to the nearest 
ounce and then converting to grams. A modified choice of class intervals that allowed 
for this gave a histogram that was well described by a normal distribution.] 
a. What is the probability that the birth weight of a randomly selected baby of this 
type exceeds 4000 g? Is between 3000 and 4000 g? 
b. What is the probability that the birth weight of a randomly selected baby of this 
type is either less than 2000 g or greater than 5000 g? 
c. What is the probability that the birth weight of a randomly selected baby of this 
type exceeds 7 lb? 
d. How would you characterize the most extreme .1% of all birth weights? 
e. If X is a random variable with a normal distribution and a is a numerical constant 
(a ≠ 0), then Y = aX also has a normal distribution. Use this to determine the 
distribution of birth weight expressed in pounds (shape, mean, and standard deviation), 
and then recalculate the probability from part (c). How does this compare to your 
previous answer? 

50. In response to concerns about nutritional contents of fast foods, McDonald’s has 
announced that it will use a new cooking oil for its french fries that will decrease 
substantially trans fatty acid levels and increase the amount of more beneficial 
polyunsaturated fat. The company claims that 97 out of 100 people cannot detect a 
difference in taste between the new and old oils. Assuming that this figure is correct 
(as a long-run proportion), what is the approximate probability that in a random sample 
of 1000 individuals who have purchased fries at McDonald’s, 
a. At least 40 can taste the difference between the two oils? 
b. At most 5% can taste the difference between the two oils? 

51. Chebyshev’s inequality (see Exercise 44, Chapter 3) is valid for continuous as well 
as discrete distributions. It states that for any number k satisfying k ≥ 1, 
P(|X − μ| ≥ kσ) ≤ 1/k² (see Exercise 44 in Chapter 3 for an interpretation). Obtain 
this probability in the case of a normal distribution for k = 1, 2, and 3, and compare 
to the upper bound. 

52. Let X denote the number of flaws along a 100-m reel of magnetic tape (an 
integer-valued variable). Suppose X has approximately a normal distribution with μ = 25 
and σ = 5. Use the continuity correction to calculate the probability that the number 
of flaws is 
a. Between 20 and 30, inclusive. 
b. At most 30. Less than 30. 

53. Let X have a binomial distribution with parameters n = 25 and p. Calculate each of 
the following probabilities using the normal approximation (with the continuity 
correction) for the cases p = .5, .6, and .8 and compare to the exact probabilities 
calculated from Appendix Table A.1. 
a. P(15 ≤ X ≤ 20) 
b. P(X ≤ 15) 
c. P(20 ≤ X) 

54. Suppose that 10% of all steel shafts produced by a certain process are 
nonconforming but can be reworked (rather than having to be scrapped). Consider a random 
sample of 200 shafts, and let X denote the number among these that are nonconforming and 
can be reworked. What is the (approximate) probability that X is 
a. At most 30? 
b. Less than 30? 
c. Between 15 and 25 (inclusive)? 

55. Suppose only 75% of all drivers in a certain state regularly wear a seat belt. A 
random sample of 500 drivers is selected. What is the probability that 
a. Between 360 and 400 (inclusive) of the drivers in the sample regularly wear a seat 
belt? 
b. Fewer than 400 of those in the sample regularly wear a seat belt? 

56. Show that the relationship between a general normal percentile and the 
corresponding z percentile is as stated in this section. 

57. a. Show that if X has a normal distribution with parameters μ and σ, then 
Y = aX + b (a linear function of X) also has a normal distribution. What are the 
parameters of the distribution of Y [i.e., E(Y) and V(Y)]? [Hint: Write the cdf of Y, 
P(Y ≤ y), as an integral involving the pdf of X, and then differentiate with respect to 
y to get the pdf of Y.] 
b. If, when measured in °C, temperature is normally distributed with mean 115 and 
standard deviation 2, what can be said about the distribution of temperature measured 
in °F? 

58. There is no nice formula for the standard normal cdf Φ(z), but several good 
approximations have been published in articles. The following is from “Approximations 
for Hand Calculators Using Small Integer Coefficients” (Mathematics of Computation, 
1977: 214-222). For 0 < z ≤ 5.5, 

\[
P(Z \ge z) = 1 - \Phi(z) \approx .5 \exp\left\{-\left[\frac{(83z + 351)z + 562}{703/z + 165}\right]\right\}
\]

The relative error of this approximation is less than .042%. Use this to calculate 
approximations to the following probabilities, and compare whenever possible to the 
probabilities obtained from Appendix Table A.3. 
a. P(Z ≥ 1) 
b. P(Z < −3) 
c. P(−4 < Z < 4) 
d. P(Z > 5) 


4.4 The Exponential and Gamma Distributions 


The density curve corresponding to any normal distribution is bell-shaped and 
therefore symmetric. There are many practical situations in which the variable of 
interest to an investigator might have a skewed distribution. One family of distribu- 
tions that has this property is the gamma family. We first consider a special case, the 
exponential distribution, and then generalize later in the section. 


The Exponential Distribution 


The family of exponential distributions provides probability models that are very 
widely used in engineering and science disciplines. 


DEFINITION X is said to have an exponential distribution with (scale) parameter λ (λ > 0) 
if the pdf of X is 

\[
f(x; \lambda) =
\begin{cases}
\lambda e^{-\lambda x} & x \ge 0 \\
0 & \text{otherwise}
\end{cases}
\tag{4.5}
\]

Some sources write the exponential pdf in the form (1/β)e^(−x/β), so that β = 1/λ. The 
expected value of an exponentially distributed random variable X is 

\[
E(X) = \int_0^\infty x \lambda e^{-\lambda x} \, dx
\]

Obtaining this expected value necessitates doing an integration by parts. The variance 
of X can be computed using the fact that V(X) = E(X²) − [E(X)]². The determination 
of E(X²) requires integrating by parts twice in succession. The results of 
these integrations are as follows: 

\[
\mu = \frac{1}{\lambda} \qquad \sigma^2 = \frac{1}{\lambda^2}
\]

Both the mean and standard deviation of the exponential distribution equal 1/λ. 
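
For readers who want the intermediate steps, one way to carry out the integrations by parts is sketched below. This is a supplementary derivation, not taken from the text; it uses the fact that x e^(−λx) and x² e^(−λx) both tend to 0 as x grows.

```latex
\begin{aligned}
E(X) &= \int_0^\infty x \lambda e^{-\lambda x}\,dx
      = \Big[-x e^{-\lambda x}\Big]_0^\infty + \int_0^\infty e^{-\lambda x}\,dx
      = 0 + \frac{1}{\lambda} = \frac{1}{\lambda} \\
E(X^2) &= \int_0^\infty x^2 \lambda e^{-\lambda x}\,dx
      = \Big[-x^2 e^{-\lambda x}\Big]_0^\infty + 2\int_0^\infty x e^{-\lambda x}\,dx
      = \frac{2}{\lambda}\,E(X) = \frac{2}{\lambda^2} \\
V(X) &= E(X^2) - [E(X)]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}
\end{aligned}
```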

Graphs of several exponential pdf’s are illustrated in Figure 4.26. 

Figure 4.26 Exponential density curves 


The exponential pdf is easily integrated to obtain the cdf. 

\[
F(x; \lambda) =
\begin{cases}
0 & x < 0 \\
1 - e^{-\lambda x} & x \ge 0
\end{cases}
\]


EXAMPLE 4.21 The article “Probabilistic Fatigue Evaluation of Riveted Railway Bridges” (J. of 
Bridge Engr., 2008: 237-244) suggested the exponential distribution with mean value 
6 MPa as a model for the distribution of stress range in certain bridge connections. Let’s 
assume that this is in fact the true model. Then E(X) = 1/λ = 6 implies that λ = .1667. 
The probability that stress range is at most 10 MPa is 

\[
P(X \le 10) = F(10; .1667) = 1 - e^{-(.1667)(10)} = 1 - .189 = .811
\]

The probability that stress range is between 5 and 10 MPa is 

\[
P(5 \le X \le 10) = F(10; .1667) - F(5; .1667) = (1 - e^{-1.667}) - (1 - e^{-.8335}) = .246
\]
■ 
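
The two probabilities in Example 4.21 can be checked with a few lines of Python. The sketch below uses only `math.exp`; the function name `expon_cdf` is a hypothetical helper that implements the exponential cdf 1 − e^(−λx) for x ≥ 0.

```python
from math import exp

def expon_cdf(x: float, lam: float) -> float:
    """F(x; lambda) = 1 - exp(-lambda * x) for x >= 0, and 0 otherwise."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

lam = 1 / 6  # a mean stress range of 6 MPa implies lambda = 1/6 (about .1667)

p_at_most_10 = expon_cdf(10, lam)                          # P(X <= 10)
p_between_5_and_10 = expon_cdf(10, lam) - expon_cdf(5, lam)  # P(5 <= X <= 10)

print(f"{p_at_most_10:.3f} {p_between_5_and_10:.3f}")
```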


The exponential distribution is frequently used as a model for the distribution 
of times between the occurrence of successive events, such as customers arriving at 
a service facility or calls coming in to a switchboard. The reason for this is that the 
exponential distribution is closely related to the Poisson process discussed in Chapter 3. 


PROPOSITION Suppose that the number of events occurring in any time interval of length t 
has a Poisson distribution with parameter αt (where α, the rate of the event 
process, is the expected number of events occurring in 1 unit of time) and that 
numbers of occurrences in nonoverlapping intervals are independent of one 
another. Then the distribution of elapsed time between the occurrence of two 
successive events is exponential with parameter λ = α. 


Although a complete proof is beyond the scope of the text, the result is easily verified 
for the time X₁ until the first event occurs: 

\[
P(X_1 \le t) = 1 - P(X_1 > t) = 1 - P[\text{no events in } (0, t)]
= 1 - \frac{e^{-\alpha t} (\alpha t)^0}{0!} = 1 - e^{-\alpha t}
\]

which is exactly the cdf of the exponential distribution. 
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EXAMPLE 4.22 Suppose that calls to a rape crisis center in a certain county occur according to a 
Poisson process with rate α = .5 call per day. Then the number of days X between 
successive calls has an exponential distribution with parameter value .5, so the 
probability that more than 2 days elapse between calls is 

\[
P(X > 2) = 1 - P(X \le 2) = 1 - F(2; .5) = e^{-(.5)(2)} = .368
\]

The expected time between successive calls is 1/.5 = 2 days. ■ 
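
The link between the Poisson process and the exponential distribution can be illustrated numerically. The following Python sketch, which assumes only the standard library and uses a hypothetical helper `poisson_pmf`, confirms that the Poisson probability of no events in an interval of length t equals the exponential probability that the waiting time exceeds t, using the rate from Example 4.22.

```python
from math import exp, factorial

def poisson_pmf(k: int, mean: float) -> float:
    """P(exactly k events) when the number of events is Poisson with the given mean."""
    return exp(-mean) * mean**k / factorial(k)

alpha = 0.5   # rate: expected number of calls per day
t = 2.0       # interval length in days

p_no_calls = poisson_pmf(0, alpha * t)   # P[no events in (0, t)]
p_gap_exceeds_t = exp(-alpha * t)        # 1 - F(t; alpha) for the exponential cdf

print(f"{p_no_calls:.3f} {p_gap_exceeds_t:.3f}")
```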


Another important application of the exponential distribution is to model 
the distribution of component lifetime. A partial reason for the popularity of 
such applications is the “memoryless” property of the exponential distribution. 
Suppose component lifetime is exponentially distributed with parameter λ. After 
putting the component into service, we leave for a period of t₀ hours and then return 
to find the component still working; what now is the probability that it lasts at least 
an additional t hours? In symbols, we wish P(X ≥ t + t₀ | X ≥ t₀). By the definition 
of conditional probability, 

\[
P(X \ge t + t_0 \mid X \ge t_0) = \frac{P[(X \ge t + t_0) \cap (X \ge t_0)]}{P(X \ge t_0)}
\]

But the event X ≥ t₀ in the numerator is redundant, since both events can occur if 
and only if X ≥ t + t₀. Therefore, 

\[
P(X \ge t + t_0 \mid X \ge t_0) = \frac{P(X \ge t + t_0)}{P(X \ge t_0)}
= \frac{1 - F(t + t_0; \lambda)}{1 - F(t_0; \lambda)} = e^{-\lambda t}
\]

This conditional probability is identical to the original probability P(X ≥ t) that the 
component lasted t hours. Thus the distribution of additional lifetime is exactly the 
same as the original distribution of lifetime, so at each point in time the component 
shows no effect of wear. In other words, the distribution of remaining lifetime is 
independent of current age. 

Although the memoryless property can be justified at least approximately 
in many applied problems, in other situations components deteriorate with age or 
occasionally improve with age (at least up to a certain point). More general lifetime 
models are then furnished by the gamma, Weibull, and lognormal distributions (the 
latter two are discussed in the next section). 
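
The memoryless property of the exponential distribution can be checked numerically with the short Python sketch below. The values of λ, t, and t₀ are arbitrary illustrative choices, not taken from the text, and the function names are hypothetical.

```python
from math import exp, isclose

def survival(t: float, lam: float) -> float:
    """P(X >= t) = 1 - F(t; lam) = exp(-lam * t) for an exponential lifetime X."""
    return exp(-lam * t)

def conditional_survival(t: float, t0: float, lam: float) -> float:
    """P(X >= t + t0 | X >= t0) = P(X >= t + t0) / P(X >= t0)."""
    return survival(t + t0, lam) / survival(t0, lam)

lam, t = 0.01, 100.0   # illustrative rate (per hour) and additional time (hours)
for t0 in (0.0, 50.0, 500.0):
    # The conditional probability matches P(X >= t) regardless of the current age t0
    print(t0, round(conditional_survival(t, t0, lam), 6), round(survival(t, lam), 6))
```

Repeating the check with a lifetime model that is not exponential would show the conditional probability changing with t₀.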


The Gamma Function 


To define the family of gamma distributions, we first need to introduce a function 
that plays an important role in many branches of mathematics. 


DEFINITION For α > 0, the gamma function Γ(α) is defined by 

\[
\Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} e^{-x} \, dx \tag{4.6}
\]


The most important properties of the gamma function are the following: 
1. For any α > 1, Γ(α) = (α − 1) · Γ(α − 1) [via integration by parts] 
2. For any positive integer n, Γ(n) = (n − 1)! 
3. Γ(1/2) = √π 

Now let 

\[
f(x; \alpha) =
\begin{cases}
\dfrac{x^{\alpha - 1} e^{-x}}{\Gamma(\alpha)} & x \ge 0 \\
0 & \text{otherwise}
\end{cases}
\tag{4.7}
\]

Then f(x; α) ≥ 0. Expression (4.6) implies that ∫₀^∞ f(x; α) dx = Γ(α)/Γ(α) = 1. Thus 
f(x; α) satisfies the two basic properties of a pdf. 
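
The three listed properties of the gamma function can be spot-checked with Python's standard-library `math.gamma`. This is only a numerical illustration at a few chosen points (the value α = 3.7 is arbitrary), not a proof.

```python
from math import gamma, factorial, sqrt, pi, isclose

alpha = 3.7  # arbitrary value greater than 1

# Property 1: Gamma(alpha) = (alpha - 1) * Gamma(alpha - 1)
recursion_holds = isclose(gamma(alpha), (alpha - 1) * gamma(alpha - 1))

# Property 2: Gamma(n) = (n - 1)! for positive integers n (checked for n = 1, ..., 10)
factorial_holds = all(isclose(gamma(n), factorial(n - 1)) for n in range(1, 11))

# Property 3: Gamma(1/2) = sqrt(pi)
half_holds = isclose(gamma(0.5), sqrt(pi))

print(recursion_holds, factorial_holds, half_holds)
```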


The Gamma Distribution 


DEFINITION A continuous random variable X is said to have a gamma distribution if the 
pdf of X is 

\[
f(x; \alpha, \beta) =
\begin{cases}
\dfrac{1}{\beta^{\alpha} \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\beta} & x \ge 0 \\
0 & \text{otherwise}
\end{cases}
\tag{4.8}
\]

where the parameters α and β satisfy α > 0, β > 0. The standard gamma 
distribution has β = 1, so the pdf of a standard gamma rv is given by (4.7). 


The exponential distribution results from taking α = 1 and β = 1/λ. 

Figure 4.27(a) illustrates the graphs of the gamma pdf f(x; α, β) (4.8) for several 
(α, β) pairs, whereas Figure 4.27(b) presents graphs of the standard gamma pdf. 
For the standard pdf, when α ≤ 1, f(x; α) is strictly decreasing as x increases from 0; 
when α > 1, f(x; α) rises from 0 at x = 0 to a maximum and then decreases. The 
parameter β in (4.8) is a scale parameter, and α is referred to as a shape parameter 
because changing its value alters the basic shape of the density curve. 


Figure 4.27 (a) Gamma density curves; (b) standard gamma density curves 


The mean and variance of a random variable X having the gamma distribution 
f(x; α, β) are 

\[
E(X) = \mu = \alpha\beta \qquad V(X) = \sigma^2 = \alpha\beta^2
\]

When X is a standard gamma rv, the cdf of X, 

\[
F(x; \alpha) = \int_0^x \frac{y^{\alpha - 1} e^{-y}}{\Gamma(\alpha)} \, dy \qquad x > 0 \tag{4.9}
\]

is called the incomplete gamma function [sometimes the incomplete gamma function 
refers to Expression (4.9) without the denominator Γ(α) in the integrand]. There 
are extensive tables of F(x; α) available; in Appendix Table A.4, we present a small 
tabulation for α = 1, 2, ..., 10 and x = 1, 2, ..., 15. 


EXAMPLE 4.23 The article “The Probability Distribution of Maintenance Cost of a System Affected
by the Gamma Process of Degradation” (Reliability Engr. and System Safety, 2012:
65-76) notes that the gamma distribution is widely used to model the extent of degradation
such as corrosion, creep, or wear. Let $X$ represent the amount of degradation of a
certain type, and suppose that it has a standard gamma distribution with $\alpha = 2$. Since

\[
P(a \le X \le b) = F(b) - F(a)
\]

when $X$ is continuous,

\[
P(3 \le X \le 5) = F(5; 2) - F(3; 2) = .960 - .801 = .159
\]

The probability that the amount of degradation exceeds 4 is

\[
P(X > 4) = 1 - P(X \le 4) = 1 - F(4; 2) = 1 - .908 = .092
\]
■

The incomplete gamma function can also be used to compute probabilities
involving nonstandard gamma distributions. These probabilities can also be obtained
almost instantaneously from various software packages.
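
As one illustration of how software might evaluate the incomplete gamma function, here is a minimal sketch based on a standard power-series expansion of the integral in (4.9). The function name `incomplete_gamma`, the tolerance, and the term limit are assumptions made for this sketch; statistical packages typically use more refined algorithms.

```python
import math

def incomplete_gamma(x, alpha, tol=1e-12, max_terms=10_000):
    """F(x; alpha) of Expression (4.9), computed from the series
    F(x; alpha) = x^alpha e^{-x} / Gamma(alpha + 1)
                  * sum_{n >= 0} x^n / ((alpha + 1)(alpha + 2)...(alpha + n)).
    """
    if x <= 0:
        return 0.0
    term = 1.0    # the n = 0 term of the sum
    total = 1.0
    for n in range(1, max_terms):
        term *= x / (alpha + n)
        total += term
        if term < tol * total:
            break
    # Work on the log scale to avoid overflow in x**alpha or Gamma(alpha + 1)
    log_prefactor = alpha * math.log(x) - x - math.lgamma(alpha + 1)
    return math.exp(log_prefactor) * total

# The two probabilities of Example 4.23 (standard gamma, alpha = 2)
p_3_to_5 = incomplete_gamma(5, 2) - incomplete_gamma(3, 2)
p_above_4 = 1 - incomplete_gamma(4, 2)
print(round(p_3_to_5, 3), round(p_above_4, 3))
```

Rounded to three decimals, the results agree with the tabled values .159 and .092 used in the example.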


PROPOSITION Let $X$ have a gamma distribution with parameters $\alpha$ and $\beta$. Then for any $x > 0$,
the cdf of $X$ is given by

\[
P(X \le x) = F(x; \alpha, \beta) = F\!\left(\frac{x}{\beta}; \alpha\right)
\]

where $F(\,\cdot\,; \alpha)$ is the incomplete gamma function.


EXAMPLE 4.24 Suppose the survival time $X$ in weeks of a randomly selected male mouse exposed
to 240 rads of gamma radiation has (what else!) a gamma distribution with
$\alpha = 8$ and $\beta = 15$. (Data in Survival Distributions: Reliability Applications in
the Biomedical Services, by A. J. Gross and V. Clark, suggests $\alpha \approx 8.5$ and
$\beta \approx 13.3$.) The expected survival time is $E(X) = (8)(15) = 120$ weeks, whereas
$V(X) = (8)(15)^2 = 1800$ and $\sigma_X = \sqrt{1800} = 42.43$ weeks. The probability that a
mouse survives between 60 and 120 weeks is

\[
\begin{aligned}
P(60 \le X \le 120) &= P(X \le 120) - P(X \le 60) \\
&= F(120/15; 8) - F(60/15; 8) \\
&= F(8; 8) - F(4; 8) = .547 - .051 = .496
\end{aligned}
\]

The probability that a mouse survives at least 30 weeks is

\[
\begin{aligned}
P(X \ge 30) &= 1 - P(X < 30) = 1 - P(X \le 30) \\
&= 1 - F(30/15; 8) = .999
\end{aligned}
\]
■
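
The calculations in this example can be reproduced with a short sketch. Because $\alpha = 8$ is a positive integer, the incomplete gamma function has the closed form $F(x; \alpha) = 1 - \sum_{k=0}^{\alpha-1} e^{-x} x^k / k!$, which is used here in place of Table A.4. The function names are illustrative choices, not notation from the text.

```python
import math

def standard_gamma_cdf_integer_shape(x, alpha):
    """F(x; alpha) for a positive integer alpha:
    1 - sum_{k=0}^{alpha-1} e^{-x} x^k / k!
    """
    if x <= 0:
        return 0.0
    return 1.0 - sum(math.exp(-x) * x ** k / math.factorial(k) for k in range(alpha))

def gamma_cdf(x, alpha, beta):
    """Proposition: F(x; alpha, beta) = F(x / beta; alpha)."""
    return standard_gamma_cdf_integer_shape(x / beta, alpha)

alpha, beta = 8, 15
mean = alpha * beta                      # E(X) = alpha * beta
sd = math.sqrt(alpha * beta ** 2)        # sigma = sqrt(alpha * beta^2)
p_60_120 = gamma_cdf(120, alpha, beta) - gamma_cdf(60, alpha, beta)
p_at_least_30 = 1 - gamma_cdf(30, alpha, beta)
print(mean, round(sd, 2), round(p_60_120, 3), round(p_at_least_30, 3))
```

For non-integer $\alpha$ the closed form does not apply, and a table, a series expansion, or statistical software would be needed instead.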


The Chi-Squared Distribution 


The chi-squared distribution is important because it is the basis for a number of 
procedures in statistical inference. The central role played by the chi-squared 
distribution in inference springs from its relationship to normal distributions (see 
Exercise 71). We’ll discuss this distribution in more detail in later chapters. 
DEFINITION Let $\nu$ be a positive integer. Then a random variable $X$ is said to have a
chi-squared distribution with parameter $\nu$ if the pdf of $X$ is the gamma density
with $\alpha = \nu/2$ and $\beta = 2$. The pdf of a chi-squared rv is thus

\[
f(x; \nu) = \begin{cases} \dfrac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{(\nu/2)-1} e^{-x/2} & x \ge 0 \\[4pt] 0 & x < 0 \end{cases} \tag{4.10}
\]

The parameter $\nu$ is called the number of degrees of freedom (df) of $X$. The
symbol $\chi^2$ is often used in place of “chi-squared.”
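
A brief numerical check can confirm that (4.10) is the gamma density (4.8) with $\alpha = \nu/2$ and $\beta = 2$. This is a sketch only; the function names and the evaluation points are arbitrary choices, and both functions return 0 at $x = 0$ purely to avoid a division by zero when the exponent of $x$ is negative (the value of a pdf at a single point does not affect probabilities).

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma pdf of Expression (4.8), evaluated for x > 0."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def chi_squared_pdf(x, nu):
    """Chi-squared pdf of Expression (4.10), evaluated for x > 0."""
    if x <= 0:
        return 0.0
    return x ** (nu / 2 - 1) * math.exp(-x / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))

for nu in (1, 4, 9):
    for x in (0.5, 3.0, 10.0):
        print(nu, x, chi_squared_pdf(x, nu), gamma_pdf(x, nu / 2, 2))
```

Each row should show the same value twice, up to floating-point rounding.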


EXERCISES Section 4.4 (59-71) 


59. Let $X$ = the time between two successive arrivals at the drive-up window of a local bank.
If $X$ has an exponential distribution with $\lambda = 1$ (which is identical to a standard
gamma distribution with $\alpha = 1$), compute the following:
a. The expected time between two successive arrivals
b. The standard deviation of the time between successive arrivals
c. $P(X \le 4)$
d. $P(2 \le X \le 5)$

60. Let $X$ denote the distance (m) that an animal moves from its birth site to the first
territorial vacancy it encounters. Suppose that for banner-tailed kangaroo rats, $X$ has an
exponential distribution with parameter $\lambda = .01386$ (as suggested in the article
“Competition and Dispersal from Multiple Nests,” Ecology, 1997: 873-883).
a. What is the probability that the distance is at most 100 m? At most 200 m? Between 100
and 200 m?
b. What is the probability that distance exceeds the mean distance by more than 2 standard
deviations?
c. What is the value of the median distance?

61. Data collected at Toronto Pearson International Airport suggests that an exponential
distribution with mean value 2.725 hours is a good model for rainfall duration (Urban
Stormwater Management Planning with Analytical Probabilistic Models, 2000, p. 69).
a. What is the probability that the duration of a particular rainfall event at this location
is at least 2 hours? At most 3 hours? Between 2 and 3 hours?
b. What is the probability that rainfall duration exceeds the mean value by more than
2 standard deviations? What is the probability that it is less than the mean value by more
than one standard deviation?

62. The article “Microwave Observations of Daily Antarctic Sea-Ice Edge Expansion and
Contribution Rates” (IEEE Geosci. and Remote Sensing Letters, 2006: 54-58) states that
“The distribution of the daily sea-ice advance/retreat from each sensor is similar and is
approximately double exponential.” The proposed double exponential distribution has density
function $f(x) = .5\lambda e^{-\lambda|x|}$ for $-\infty < x < \infty$. The standard deviation is
given as 40.9 km.
a. What is the value of the parameter $\lambda$?
b. What is the probability that the extent of daily sea-ice change is within 1 standard
deviation of the mean value?

63. A consumer is trying to decide between two long-distance calling plans. The first one
charges a flat rate of 10¢ per minute, whereas the second charges a flat rate of 99¢ for
calls up to 20 minutes in duration and then 10¢ for each additional minute exceeding 20
(assume that calls lasting a noninteger number of minutes are charged proportionately to a
whole-minute’s charge). Suppose the consumer’s distribution of call duration is exponential
with parameter $\lambda$.
a. Explain intuitively how the choice of calling plan should depend on what the expected
call duration is.
b. Which plan is better if expected call duration is 10 minutes? 15 minutes? [Hint: Let
$h_1(x)$ denote the cost for the first plan when call duration is $x$ minutes and let $h_2(x)$
be the cost function for the second plan. Give expressions for these two cost functions, and
then determine the expected cost for each plan.]

64. Evaluate the following:
a. $\Gamma(6)$
b. $\Gamma(5/2)$
c. $F(4; 5)$ (the incomplete gamma function) and $F(5; 4)$
d. $P(X \le 5)$ when $X$ has a standard gamma distribution with $\alpha = 7$.
e. $P(3 < X < 8)$ when $X$ has the distribution specified in (d).

65. Let $X$ denote the data transfer time (ms) in a grid computing system (the time required
for data transfer between a “worker” computer and a “master” computer). Suppose that $X$ has
a gamma distribution with mean value 37.5 ms and standard deviation 21.6 (suggested by the
article “Computation Time of Grid Computing with Data Transfer Times that Follow a Gamma
Distribution,” Proceedings of the First International Conference on Semantics, Knowledge,
and Grid, 2005).
a. What are the values of $\alpha$ and $\beta$?
b. What is the probability that data transfer time exceeds 50 ms?
c. What is the probability that data transfer time is between 50 and 75 ms?

66. The two-parameter gamma distribution can be generalized by introducing a third parameter
$\gamma$, called a threshold or location parameter: replace $x$ in (4.8) by $x - \gamma$ and
$x \ge 0$ by $x \ge \gamma$. This amounts to shifting the density curves in Figure 4.27 so that
they begin their ascent or descent at $\gamma$ rather than 0. The article “Bivariate Flood
Frequency Analysis with Historical Information Based on Copulas” (J. of Hydrologic Engr.,
2013: 1018-1030) employs this distribution to model $X$ = 3-day flood volume ($10^8\ \text{m}^3$).
Suppose that values of the parameters are $\alpha = 12$, $\beta = 7$, $\gamma = 40$ (very close to
estimates in the cited article based on past data).
a. What are the mean value and standard deviation of $X$?
b. What is the probability that flood volume is between 100 and 150?
c. What is the probability that flood volume exceeds its mean value by more than one
standard deviation?
d. What is the 95th percentile of the flood volume distribution?

67. Suppose that when a transistor of a certain type is subjected to an accelerated life
test, the lifetime $X$ (in weeks) has a gamma distribution with mean 24 weeks and standard
deviation 12 weeks.
a. What is the probability that a transistor will last between 12 and 24 weeks?
b. What is the probability that a transistor will last at most 24 weeks? Is the median of
the lifetime distribution less than 24? Why or why not?
c. What is the 99th percentile of the lifetime distribution?
d. Suppose the test will actually be terminated after $t$ weeks. What value of $t$ is such
that only .5% of all transistors would still be operating at termination?

68. The special case of the gamma distribution in which $\alpha$ is a positive integer $n$ is
called an Erlang distribution. If we replace $\beta$ by $1/\lambda$ in Expression (4.8), the
Erlang pdf is

\[
f(x; \lambda, n) = \begin{cases} \dfrac{\lambda(\lambda x)^{n-1} e^{-\lambda x}}{(n-1)!} & x \ge 0 \\[4pt] 0 & x < 0 \end{cases}
\]

It can be shown that if the times between successive events are independent, each with an
exponential distribution with parameter $\lambda$, then the total time $X$ that elapses before
all of the next $n$ events occur has pdf $f(x; \lambda, n)$.
a. What is the expected value of $X$? If the time (in minutes) between arrivals of
successive customers is exponentially distributed with $\lambda = .5$, how much time can be
expected to elapse before the tenth customer arrives?
b. If customer interarrival time is exponentially distributed with $\lambda = .5$, what is the
probability that the tenth customer (after the one who has just arrived) will arrive within
the next 30 min?
c. The event $\{X \le t\}$ occurs iff at least $n$ events occur in the next $t$ units of time.
Use the fact that the number of events occurring in an interval of length $t$ has a Poisson
distribution with parameter $\lambda t$ to write an expression (involving Poisson
probabilities) for the Erlang cdf $F(t; \lambda, n) = P(X \le t)$.

69. A system consists of five identical components connected in series as shown:

[Figure: five components connected in series]

As soon as one component fails, the entire system will fail. Suppose each component has a
lifetime that is exponentially distributed with $\lambda = .01$ and that components fail
independently of one another. Define events $A_i$ = {ith component lasts at least $t$ hours},
$i = 1, \ldots, 5$, so that the $A_i$s are independent events. Let $X$ = the time at which the
system fails, that is, the shortest (minimum) lifetime among the five components.
a. The event $\{X \ge t\}$ is equivalent to what event involving $A_1, \ldots, A_5$?
b. Using the independence of the $A_i$'s, compute $P(X \ge t)$. Then obtain
$F(t) = P(X \le t)$ and the pdf of $X$. What type of distribution does $X$ have?
c. Suppose there are $n$ components, each having exponential lifetime with parameter
$\lambda$. What type of distribution does $X$ have?

70. If $X$ has an exponential distribution with parameter $\lambda$, derive a general
expression for the (100p)th percentile of the distribution. Then specialize to obtain the
median.

71. a. The event $\{X^2 \le y\}$ is equivalent to what event involving $X$ itself?
b. If $X$ has a standard normal distribution, use part (a) to write the integral that equals
$P(X^2 \le y)$. Then differentiate this with respect to $y$ to obtain the pdf of $X^2$ [the
square of a N(0, 1) variable]. Finally, show that $X^2$ has a chi-squared distribution with
$\nu = 1$ df [see (4.10)]. [Hint: Use the following identity.]

\[
\frac{d}{dy}\left\{\int_{a(y)}^{b(y)} f(x)\,dx\right\} = f[b(y)] \cdot b'(y) - f[a(y)] \cdot a'(y)
\]
4.5 Other Continuous Distributions


The normal, gamma (including exponential), and uniform families of distributions 
provide a wide variety of probability models for continuous variables, but there are 
many practical situations in which no member of these families fits a set of observed 
data very well. Statisticians and other investigators have developed other families of 
distributions that are often appropriate in practice. 


The Weibull Distribution 


The family of Weibull distributions was introduced by the Swedish physicist
Waloddi Weibull in 1939; his 1951 article “A Statistical Distribution Function
of Wide Applicability” (J. of Applied Mechanics, vol. 18: 293-297) discusses a
number of applications.


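DEFINITION A random variable $X$ is said to have a Weibull distribution with shape parameter
$\alpha$ and scale parameter $\beta$ ($\alpha > 0$, $\beta > 0$) if the pdf of $X$ is

\[
f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}}\, x^{\alpha-1} e^{-(x/\beta)^{\alpha}} & x \ge 0 \\[4pt] 0 & x < 0 \end{cases} \tag{4.11}
\]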


In some situations, there are theoretical justifications for the appropriateness
of the Weibull distribution, but in many applications $f(x; \alpha, \beta)$ simply provides a
good fit to observed data for particular values of $\alpha$ and $\beta$. When $\alpha = 1$, the pdf
reduces to the exponential distribution (with $\lambda = 1/\beta$), so the exponential
distribution is a special case of both the gamma and Weibull distributions. However, there
are gamma distributions that are not Weibull distributions and vice versa, so one
family is not a subset of the other. Both $\alpha$ and $\beta$ can be varied to obtain a number of
different-looking density curves, as illustrated in Figure 4.28.


Figure 4.28 Weibull density curves (curves shown for $\alpha = 1$, $\beta = 1$ (exponential);
$\alpha = 2$, $\beta = 1$; and $\alpha = 2$, $\beta = .5$)
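Integrating to obtain $E(X)$ and $E(X^2)$ yields

\[
\mu = \beta\,\Gamma\!\left(1 + \frac{1}{\alpha}\right) \qquad
\sigma^2 = \beta^2\left\{\Gamma\!\left(1 + \frac{2}{\alpha}\right) - \left[\Gamma\!\left(1 + \frac{1}{\alpha}\right)\right]^2\right\}
\]

The computation of $\mu$ and $\sigma^2$ thus necessitates using the gamma function.
The integration $\int_0^x f(y; \alpha, \beta)\,dy$ is easily carried out to obtain the cdf of $X$.

The cdf of a Weibull rv having parameters $\alpha$ and $\beta$ is

\[
F(x; \alpha, \beta) = \begin{cases} 0 & x < 0 \\[2pt] 1 - e^{-(x/\beta)^{\alpha}} & x \ge 0 \end{cases} \tag{4.12}
\]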
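
The moment formulas and the cdf (4.12) translate directly into code. The sketch below uses illustrative function names and checks them against the special case $\alpha = 1$, $\beta = 2$, which is an exponential distribution with $\lambda = 1/\beta = .5$ and therefore has mean 2 and variance 4.

```python
import math

def weibull_mean(alpha, beta):
    """mu = beta * Gamma(1 + 1/alpha)."""
    return beta * math.gamma(1 + 1 / alpha)

def weibull_variance(alpha, beta):
    """sigma^2 = beta^2 * {Gamma(1 + 2/alpha) - [Gamma(1 + 1/alpha)]^2}."""
    g1 = math.gamma(1 + 1 / alpha)
    g2 = math.gamma(1 + 2 / alpha)
    return beta ** 2 * (g2 - g1 ** 2)

def weibull_cdf(x, alpha, beta):
    """Expression (4.12)."""
    if x < 0:
        return 0.0
    return 1 - math.exp(-((x / beta) ** alpha))

# Special case alpha = 1: exponential with lambda = 1/beta
print(weibull_mean(1, 2), weibull_variance(1, 2), weibull_cdf(2, 1, 2))
```

The cdf value at $x = 2$ should equal $1 - e^{-1}$, the exponential cdf $1 - e^{-\lambda x}$ at that point.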


EXAMPLE 4.25 In recent years the Weibull distribution has been used to model engine emissions of
various pollutants. Let $X$ denote the amount of NO$_x$ emission (g/gal) from a randomly
selected four-stroke engine of a certain type, and suppose that $X$ has a Weibull distribution
with $\alpha = 2$ and $\beta = 10$ (suggested by information in the article “Quantification
of Variability and Uncertainty in Lawn and Garden Equipment NO$_x$ and Total
Hydrocarbon Emission Factors,” J. of the Air and Waste Management Assoc.,
2002: 435-448). The corresponding density curve looks exactly like the one in Figure 4.28
for $\alpha = 2$, $\beta = 1$ except that now the values 50 and 100 replace 5 and 10 on
the horizontal axis. Then

\[
P(X \le 10) = F(10; 2, 10) = 1 - e^{-(10/10)^2} = 1 - e^{-1} = .632
\]

Similarly, $P(X \le 25) = .998$, so the distribution is almost entirely concentrated on
values between 0 and 25. The value $c$ which separates the 5% of all engines having
the largest amounts of NO$_x$ emissions from the remaining 95% satisfies

\[
.95 = 1 - e^{-(c/10)^2}
\]

Isolating the exponential term on one side, taking logarithms, and solving the resulting
equation gives $c \approx 17.3$ as the 95th percentile of the emission distribution. ■
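
The percentile step described above amounts to inverting (4.12): solving $p = 1 - e^{-(c/\beta)^{\alpha}}$ for $c$ gives $c = \beta[-\ln(1 - p)]^{1/\alpha}$. A minimal sketch of the example's calculations follows; the function names are illustrative.

```python
import math

def weibull_cdf(x, alpha, beta):
    """Expression (4.12)."""
    if x < 0:
        return 0.0
    return 1 - math.exp(-((x / beta) ** alpha))

def weibull_percentile(p, alpha, beta):
    """Solve p = 1 - exp(-(c/beta)^alpha) for c, where 0 < p < 1."""
    return beta * (-math.log(1 - p)) ** (1 / alpha)

alpha, beta = 2, 10
p_at_most_10 = weibull_cdf(10, alpha, beta)
p_at_most_25 = weibull_cdf(25, alpha, beta)
c_95 = weibull_percentile(0.95, alpha, beta)
print(round(p_at_most_10, 3), round(p_at_most_25, 3), round(c_95, 1))
```

Substituting `c_95` back into `weibull_cdf` should return .95, which is a convenient check on the algebra.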


In practical situations, a Weibull model may be reasonable except that the
smallest possible $X$ value may be some value $\gamma$ not assumed to be zero (this would
also apply to a gamma model; see Exercise 66). The quantity $\gamma$ can then be regarded
as a third (threshold or location) parameter of the distribution, which is what Weibull
did in his original work. For, say, $\gamma = 3$, all curves in Figure 4.28 would be shifted 3
units to the right. This is equivalent to saying that $X - \gamma$ has the pdf (4.11), so that
the cdf of $X$ is obtained by replacing $x$ in (4.12) by $x - \gamma$.


EXAMPLE 4.26 An understanding of the volumetric properties of asphalt is important in designing
mixtures which will result in high-durability pavement. The article “Is a Normal
Distribution the Most Appropriate Statistical Distribution for Volumetric
Properties in Asphalt Mixtures?” (J. of Testing and Evaluation, Sept. 2009:
1-11) used the analysis of some sample data to recommend that for a particular
mixture, $X$ = air void volume (%) be modeled with a three-parameter Weibull distribution.
Suppose the values of the parameters are $\gamma = 4$, $\alpha = 1.3$, and $\beta = .8$ (quite
close to estimates given in the article).

For $x > 4$, the cumulative distribution function is

\[
F(x; \alpha, \beta, \gamma) = F(x; 1.3, .8, 4) = 1 - e^{-[(x-4)/.8]^{1.3}}
\]

The probability that the air void volume of a specimen is between 5% and 6% is

\[
\begin{aligned}
P(5 \le X \le 6) &= F(6; 1.3, .8, 4) - F(5; 1.3, .8, 4) = e^{-[(5-4)/.8]^{1.3}} - e^{-[(6-4)/.8]^{1.3}} \\
&= .263 - .037 = .226
\end{aligned}
\]


Figure 4.29 shows a graph from Minitab of the corresponding Weibull density function
in which the shaded area corresponds to the probability just calculated.

Figure 4.29 Weibull density curve with threshold = 4, shape = 1.3, scale = .8 (shaded area
between $x = 5$ and $x = 6$ equals .226) ■
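
The shaded area can also be obtained without Minitab. The sketch below implements the three-parameter cdf by replacing $x$ in (4.12) with $x - \gamma$; the function name is illustrative, and the argument is called `threshold` to avoid confusion with the gamma function.

```python
import math

def weibull3_cdf(x, alpha, beta, threshold):
    """cdf of a three-parameter Weibull rv: (4.12) with x replaced by x - threshold."""
    if x <= threshold:
        return 0.0
    return 1 - math.exp(-(((x - threshold) / beta) ** alpha))

alpha, beta, threshold = 1.3, 0.8, 4
p_5_to_6 = weibull3_cdf(6, alpha, beta, threshold) - weibull3_cdf(5, alpha, beta, threshold)
print(round(p_5_to_6, 3))
```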


The Lognormal Distribution 


DEFINITION A nonnegative rv $X$ is said to have a lognormal distribution if the rv
$Y = \ln(X)$ has a normal distribution. The resulting pdf of a lognormal rv when
$\ln(X)$ is normally distributed with parameters $\mu$ and $\sigma$ is

\[
f(x; \mu, \sigma) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x}\, e^{-[\ln(x) - \mu]^2/(2\sigma^2)} & x \ge 0 \\[4pt] 0 & x < 0 \end{cases}
\]


Be careful here; the parameters $\mu$ and $\sigma$ are not the mean and standard deviation of
$X$ but of $\ln(X)$. The mean and variance of $X$ can be shown to be

\[
E(X) = e^{\mu + \sigma^2/2} \qquad V(X) = e^{2\mu + \sigma^2} \cdot \left(e^{\sigma^2} - 1\right)
\]


In Chapter 5, we will present a theoretical justification for this distribution in con- 
nection with the Central Limit Theorem. But as with other distributions, the lognor- 
mal can be used as a model even in the absence of such justification. Figure 4.30 
illustrates graphs of the lognormal pdf; although a normal curve is symmetric, a 
lognormal curve has a positive skew. 

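Because $\ln(X)$ has a normal distribution, the cdf of $X$ can be expressed in terms
of the cdf $\Phi(z)$ of a standard normal rv $Z$.

\[
\begin{aligned}
F(x; \mu, \sigma) &= P(X \le x) = P[\ln(X) \le \ln(x)] \\
&= P\!\left(Z \le \frac{\ln(x) - \mu}{\sigma}\right) = \Phi\!\left(\frac{\ln(x) - \mu}{\sigma}\right) \qquad x > 0
\end{aligned} \tag{4.13}
\]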
Figure 4.30 Lognormal density curves


EXAMPLE 4.27 According to the article “Predictive Model for Pitting Corrosion in Buried Oil
and Gas Pipelines” (Corrosion, 2009: 332-342), the lognormal distribution has
been reported as the best option for describing the distribution of maximum pit depth
data from cast iron pipes in soil. The authors suggest that a lognormal distribution
with $\mu = .353$ and $\sigma = .754$ is appropriate for maximum pit depth (mm) of buried
pipelines. For this distribution, the mean value and variance of pit depth are

\[
\begin{aligned}
E(X) &= e^{.353 + (.754)^2/2} = e^{.6373} = 1.891 \\
V(X) &= e^{2(.353) + (.754)^2} \cdot \left(e^{(.754)^2} - 1\right) = (3.57697)(.765645) = 2.7387
\end{aligned}
\]

The probability that maximum pit depth is between 1 and 2 mm is

\[
\begin{aligned}
P(1 \le X \le 2) &= P(\ln(1) \le \ln(X) \le \ln(2)) = P(0 \le \ln(X) \le .693) \\
&= P\!\left(\frac{0 - .353}{.754} \le Z \le \frac{.693 - .353}{.754}\right) = \Phi(.45) - \Phi(-.47) = .354
\end{aligned}
\]

This probability is illustrated in Figure 4.31 (from Minitab).

Figure 4.31 Lognormal density curve with $\mu = .353$ and $\sigma = .754$

What value $c$ is such that only 1% of all specimens have a maximum pit depth
exceeding $c$? The desired value satisfies

\[
.99 = P(X \le c) = P\!\left(Z \le \frac{\ln(c) - .353}{.754}\right)
\]

The $z$ critical value 2.33 captures an upper-tail area of .01 ($z_{.01} = 2.33$), and thus a
cumulative area of .99. This implies that

\[
\frac{\ln(c) - .353}{.754} = 2.33
\]

from which $\ln(c) = 2.1098$ and $c = 8.247$. Thus 8.247 is the 99th percentile of the
maximum pit depth distribution. ■
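
These lognormal calculations can be reproduced with Python's standard library, which provides the standard normal cdf and its inverse through `statistics.NormalDist`. This is a sketch with illustrative function names. Because it uses the unrounded critical value $z_{.01} \approx 2.326$ rather than the tabled 2.33, its 99th percentile comes out slightly below 8.247.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def lognormal_mean(mu, sigma):
    return math.exp(mu + sigma ** 2 / 2)

def lognormal_variance(mu, sigma):
    return math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)

def lognormal_cdf(x, mu, sigma):
    """Expression (4.13): Phi((ln(x) - mu) / sigma) for x > 0."""
    if x <= 0:
        return 0.0
    return Z.cdf((math.log(x) - mu) / sigma)

def lognormal_percentile(p, mu, sigma):
    """Solve Phi((ln(c) - mu) / sigma) = p for c."""
    return math.exp(mu + sigma * Z.inv_cdf(p))

mu, sigma = 0.353, 0.754
mean = lognormal_mean(mu, sigma)
variance = lognormal_variance(mu, sigma)
p_1_to_2 = lognormal_cdf(2, mu, sigma) - lognormal_cdf(1, mu, sigma)
c_99 = lognormal_percentile(0.99, mu, sigma)
print(round(mean, 3), round(variance, 4), round(p_1_to_2, 3), round(c_99, 2))
```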


The Beta Distribution 


All families of continuous distributions discussed so far except for the uniform dis- 
tribution have positive density over an infinite interval (though typically the density 
function decreases rapidly to zero beyond a few standard deviations from the mean). 
The beta distribution provides positive density only for X in an interval of finite length. 


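DEFINITION A random variable $X$ is said to have a beta distribution with parameters $\alpha$, $\beta$
(both positive), $A$, and $B$ if the pdf of $X$ is

\[
f(x; \alpha, \beta, A, B) = \begin{cases} \dfrac{1}{B - A} \cdot \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \cdot \Gamma(\beta)} \left(\dfrac{x - A}{B - A}\right)^{\alpha - 1} \left(\dfrac{B - x}{B - A}\right)^{\beta - 1} & A \le x \le B \\[6pt] 0 & \text{otherwise} \end{cases}
\]

The case $A = 0$, $B = 1$ gives the standard beta distribution.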


Figure 4.32 illustrates several standard beta pdf’s. Graphs of the general pdf are
similar, except they are shifted and then stretched or compressed to fit over $[A, B]$.
Unless $\alpha$ and $\beta$ are integers, integration of the pdf to calculate probabilities is
difficult. Either a table of the incomplete beta function or appropriate software should
be used. The mean and variance of $X$ are

\[
\mu = A + (B - A) \cdot \frac{\alpha}{\alpha + \beta} \qquad
\sigma^2 = \frac{(B - A)^2 \alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}
\]

Figure 4.32 Standard beta density curves


EXAMPLE 4.28 Project managers often use a method labeled PERT (for program evaluation and
review technique) to coordinate the various activities making up a large project.
(One successful application was in the construction of the Apollo spacecraft.) A
standard assumption in PERT analysis is that the time necessary to complete any
particular activity once it has been started has a beta distribution with $A$ = the
optimistic time (if everything goes well) and $B$ = the pessimistic time (if everything
goes badly). Suppose that in constructing a single-family house, the time $X$ (in days)
necessary for laying the foundation has a beta distribution with $A = 2$, $B = 5$, $\alpha = 2$,
and $\beta = 3$. Then $\alpha/(\alpha + \beta) = .4$, so $E(X) = 2 + (3)(.4) = 3.2$. For these values of
$\alpha$ and $\beta$, the pdf of $X$ is a simple polynomial function. The probability that it takes
at most 3 days to lay the foundation is

\[
\begin{aligned}
P(X \le 3) &= \int_2^3 \frac{1}{3} \cdot \frac{4!}{1!\,2!} \left(\frac{x - 2}{3}\right) \left(\frac{5 - x}{3}\right)^2 dx \\
&= \frac{4}{27} \int_2^3 (x - 2)(5 - x)^2\,dx = \frac{4}{27} \cdot \frac{11}{4} = \frac{11}{27} = .407
\end{aligned}
\]
■

The standard beta distribution is commonly used to model variation in the 
proportion or percentage of a quantity occurring in different samples, such as the 
proportion of a 24-hour day that an individual is asleep or the proportion of a certain 
element in a chemical compound. 


EXERCISES Section 4.5 (72-86) 


72. The lifetime $X$ (in hundreds of hours) of a certain type of vacuum tube has a Weibull
distribution with parameters $\alpha = 2$ and $\beta = 3$. Compute the following:
a. $E(X)$ and $V(X)$
b. $P(X \le 6)$
c. $P(1.5 \le X \le 6)$
(This Weibull distribution is suggested as a model for time in service in “On the
Assessment of Equipment Reliability: Trading Data Collection Costs for Precision,”
J. of Engr. Manuf., 1991: 105-109.)

73. The authors of the article “A Probabilistic Insulation Life Model for Combined
Thermal-Electrical Stresses” (IEEE Trans. on Elect. Insulation, 1985: 519-522) state that
“the Weibull distribution is widely used in statistical problems relating to aging of solid
insulating materials subjected to aging and stress.” They propose the use of the
distribution as a model for time (in hours) to failure of solid insulating specimens
subjected to AC voltage. The values of the parameters depend on the voltage and temperature;
suppose $\alpha = 2.5$ and $\beta = 200$ (values suggested by data in the article).
a. What is the probability that a specimen’s lifetime is at most 250? Less than 250? More
than 300?
b. What is the probability that a specimen’s lifetime is between 100 and 250?
c. What value is such that exactly 50% of all specimens have lifetimes exceeding that value?

74. Once an individual has been infected with a certain disease, let $X$ represent the time
(days) that elapses before the individual becomes infectious. The article “The Probability
of Containment for Multitype Branching Process Models for Emerging Epidemics” (J. of
Applied Probability, 2011: 173-188) proposes a Weibull distribution with $\alpha = 2.2$,
$\beta = 1.1$, and $\gamma = .5$ (refer to Example 4.26).
a. Calculate $P(1 < X < 2)$.
b. Calculate $P(X > 1.5)$.
c. What is the 90th percentile of the distribution?
d. What are the mean and standard deviation of $X$?

75. Let $X$ have a Weibull distribution with the pdf from Expression (4.11). Verify that
$\mu = \beta\,\Gamma(1 + 1/\alpha)$. [Hint: In the integral for $E(X)$, make the change of
variable $y = (x/\beta)^{\alpha}$, so that $x = \beta y^{1/\alpha}$.]

76. The article “The Statistics of Phytotoxic Air Pollutants” (J. of Royal Stat. Soc., 1989:
183-198) suggests the lognormal distribution as a model for SO$_2$ concentration above a
certain forest. Suppose the parameter values are $\mu = 1.9$ and $\sigma = .9$.
a. What are the mean value and standard deviation of concentration?
b. What is the probability that concentration is at most 10? Between 5 and 10?

77. The authors of the article from which the data in Exercise 1.27 was extracted suggested
that a reasonable probability model for drill lifetime was a lognormal distribution with
$\mu = 4.5$ and $\sigma = .8$.
a. What are the mean value and standard deviation of lifetime?
b. What is the probability that lifetime is at most 100?
c. What is the probability that lifetime is at least 200? Greater than 200?

78. The article “On Assessing the Accuracy of Offshore Wind Turbine Reliability-Based Design
Loads from the Environmental Contour Method” (Intl. J. of Offshore and Polar Engr., 2005:
132-140) proposes the Weibull distribution with $\alpha = 1.817$ and $\beta = .863$ as a model
for 1-hour significant wave height (m) at a certain site.
a. What is the probability that wave height is at most 5 m?
b. What is the probability that wave height exceeds its mean value by more than one
standard deviation?
c. What is the median of the wave-height distribution?
d. For $0 < p < 1$, give a general expression for the 100pth percentile of the wave-height
distribution.

79. Nonpoint source loads are chemical masses that travel to the main stem of a river and
its tributaries in flows that are distributed over relatively long stream reaches, in
contrast to those that enter at well-defined and regulated points. The article “Assessing
Uncertainty in Mass Balance Calculation of River Nonpoint Source Loads” (J. of Envir.
Engr., 2008: 247-258) suggested that for a certain time period and location, $X$ = nonpoint
source load of total dissolved solids could be modeled with a lognormal distribution having
mean value 10,281 kg/day/km and a coefficient of variation CV = .40 (CV = $\sigma_X/\mu_X$).
a. What are the mean value and standard deviation of $\ln(X)$?
b. What is the probability that $X$ is at most 15,000 kg/day/km?
c. What is the probability that $X$ exceeds its mean value, and why is this probability
not .5?
d. Is 17,000 the 95th percentile of the distribution?

80. a. Use Equation (4.13) to write a formula for the median $\tilde{\mu}$ of the lognormal
distribution. What is the median for the load distribution of Exercise 79?
b. Recalling that $z_{\alpha}$ is our notation for the $100(1 - \alpha)$ percentile of the
standard normal distribution, write an expression for the $100(1 - \alpha)$ percentile of the
lognormal distribution. In Exercise 79, what value will load exceed only 1% of the time?

81. Sales delay is the elapsed time between the manufacture of a product and its sale.
According to the article “Warranty Claims Data Analysis Considering Sales Delay” (Quality
and Reliability Engr. Intl., 2013: 113-123), it is quite common for investigators to model
sales delay using a lognormal distribution. For a particular product, the cited article
proposes this distribution with parameter values $\mu = 2.05$ and $\sigma^2 = .06$ (here the
unit for delay is months).
a. What are the variance and standard deviation of delay time?
b. What is the probability that delay time exceeds 12 months?
c. What is the probability that delay time is within one standard deviation of its mean
value?
d. What is the median of the delay time distribution?
e. What is the 99th percentile of the delay time distribution?
f. Among 10 randomly selected such items, how many would you expect to have a delay time
exceeding 8 months?

82. As in the case of the Weibull and Gamma distributions, the lognormal distribution can be
modified by the introduction of a third parameter $\gamma$ such that the pdf is shifted to be
positive only for $x > \gamma$. The article cited in Exercise 4.39 suggested that a shifted
lognormal distribution with shift (i.e., threshold) = 1.0, mean value = 2.16, and standard
deviation = 1.03 would be an appropriate model for the rv $X$ = maximum-to-average depth
ratio of a corrosion defect in pressurized steel.
a. What are the values of $\mu$ and $\sigma$ for the proposed distribution?
b. What is the probability that depth ratio exceeds 2?
c. What is the median of the depth ratio distribution?
d. What is the 99th percentile of the depth ratio distribution?

83. What condition on $\alpha$ and $\beta$ is necessary for the standard beta pdf to be
symmetric?

84. Suppose the proportion $X$ of surface area in a randomly selected quadrat that is covered
by a certain plant has a standard beta distribution with $\alpha = 5$ and $\beta = 2$.
a. Compute $E(X)$ and $V(X)$.
b. Compute $P(X \le .2)$.
c. Compute $P(.2 \le X \le .4)$.
d. What is the expected proportion of the sampling region not covered by the plant?

85. Let $X$ have a standard beta density with parameters $\alpha$ and $\beta$.
a. Verify the formula for $E(X)$ given in the section.
b. Compute $E[(1 - X)^m]$. If $X$ represents the proportion of a substance consisting of a
particular ingredient, what is the expected proportion that does not consist of this
ingredient?

86. Stress is applied to a 20-in. steel bar that is clamped in a fixed position at each end.
Let $Y$ = the distance from the left end at which the bar snaps. Suppose $Y/20$ has a
standard beta distribution with $E(Y) = 10$ and $V(Y) = \frac{100}{7}$.
a. What are the parameters of the relevant standard beta distribution?
b. Compute $P(8 \le Y \le 12)$.
c. Compute the probability that the bar snaps more than 2 in. from where you expect it to.
4.6 Probability Plots


An investigator will often have obtained a numerical sample $x_1, x_2, \ldots, x_n$ and wish
to know whether it is plausible that it came from a population distribution of some 
particular type (e.g., from a normal distribution). For one thing, many formal pro- 
cedures from statistical inference are based on the assumption that the population 
distribution is of a specified type. The use of such a procedure is inappropriate 
if the actual underlying probability distribution differs greatly from the assumed 
type. For example, the article “Toothpaste Detergents: A Potential Source of 
Oral Soft Tissue Damage” (Intl. J. of Dental Hygiene, 2008: 193-198) contains 
the following statement: “Because the sample number for each experiment (rep- 
lication) was limited to three wells per treatment type, the data were assumed to 
be normally distributed.” As justification for this leap of faith, the authors wrote 
that “Descriptive statistics showed standard deviations that suggested a normal 
distribution to be highly likely.” Note: This argument is not very persuasive. 

Additionally, understanding the underlying distribution can sometimes give 
insight into the physical mechanisms involved in generating the data. An effective 
way to check a distributional assumption is to construct what is called a probability 
plot. The essence of such a plot is that if the distribution on which the plot is based is 
correct, the points in the plot should fall close to a straight line. If the actual distribu- 
tion is quite different from the one used to construct the plot, the points will likely 
depart substantially from a linear pattern. 


Sample Percentiles 


The details involved in constructing probability plots differ a bit from source to source.
The basis for our construction is a comparison between percentiles of the sample data
and the corresponding percentiles of the distribution under consideration. Recall that
the (100p)th percentile of a continuous distribution with cdf $F(\cdot)$ is the number $\eta(p)$
that satisfies $F(\eta(p)) = p$. That is, $\eta(p)$ is the number on the measurement scale such
that the area under the density curve to the left of $\eta(p)$ is $p$. Thus the 50th percentile
$\eta(.5)$ satisfies $F(\eta(.5)) = .5$, and the 90th percentile satisfies $F(\eta(.9)) = .9$. Consider
as an example the standard normal distribution, for which we have denoted the cdf
by $\Phi(\cdot)$. From Appendix Table A.3, we find the 20th percentile by locating the row
and column in which .2000 (or a number as close to it as possible) appears inside the
table. Since .2005 appears at the intersection of the $-.8$ row and the .04 column, the
20th percentile is approximately $-.84$. Similarly, the 25th percentile of the standard
normal distribution is (using linear interpolation) approximately $-.675$.

Roughly speaking, sample percentiles are defined in the same way that percentiles
of a population distribution are defined. The 50th sample percentile should separate
the smallest 50% of the sample from the largest 50%, the 90th percentile should
be such that 90% of the sample lies below that value and 10% lies above, and so on. 
Unfortunately, we run into problems when we actually try to compute the sample per- 
centiles for a particular sample of n observations. If, for example, n = 10, we can split 
off 20% of these values or 30% of the data, but there is no value that will split off exactly 
23% of these ten observations. To proceed further, we need an operational definition 
of sample percentiles (this is one place where different people do slightly different 
things). Recall that when n is odd, the sample median or 50th sample percentile is the
middle value in the ordered list, for example, the sixth-largest value when n = 11. This
amounts to regarding the middle observation as being half in the lower half of the data
and half in the upper half. Similarly, suppose n = 10. Then if we call the third-smallest
value the 25th percentile, we are regarding that value as being half in the lower group
(consisting of the two smallest observations) and half in the upper group (the seven
largest observations). This leads to the following general definition of sample percentiles.


DEFINITION Order the n sample observations from smallest to largest. Then the ith smallest
observation in the list is taken to be the $[100(i - .5)/n]$th sample percentile.


Once the percentage values $100(i - .5)/n$ $(i = 1, 2, \ldots, n)$ have been calculated,
sample percentiles corresponding to intermediate percentages can be obtained
by linear interpolation. For example, if n = 10, the percentages corresponding to
the ordered sample observations are $100(1 - .5)/10 = 5\%$, $100(2 - .5)/10 = 15\%$,
25%, ..., and $100(10 - .5)/10 = 95\%$. The 10th percentile is then halfway between
the 5th percentile (smallest sample observation) and the 15th percentile
(second-smallest observation). For our purposes, such interpolation is not necessary
because a probability plot will be based only on the percentages $100(i - .5)/n$
corresponding to the n sample observations.
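
The operational definition above translates into a few lines of code. The following Python sketch is illustrative only: the function names `sample_percentages` and `sample_percentile` are hypothetical, and the interpolation rule is the one described in the text (other sources use slightly different conventions).

```python
def sample_percentages(n):
    """Percentage 100(i - .5)/n attached to the ith smallest observation."""
    return [100 * (i - 0.5) / n for i in range(1, n + 1)]


def sample_percentile(data, pct):
    """Sample percentile by linear interpolation between ordered observations.

    Only percentages between 100(.5)/n and 100(n - .5)/n are handled here.
    """
    xs = sorted(data)
    n = len(xs)
    pos = pct * n / 100 + 0.5          # fractional rank i solving 100(i - .5)/n = pct
    if not 1 <= pos <= n:
        raise ValueError("percentage lies outside the range covered by the sample")
    lo = int(pos)                      # rank of the observation at or just below pct
    frac = pos - lo
    if frac == 0:
        return xs[lo - 1]
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])


# Made-up data: the integers 1 through 10, so the ith smallest value is i.
data = list(range(1, 11))
percentages = sample_percentages(len(data))
tenth = sample_percentile(data, 10)    # halfway between the two smallest values
```

With n = 10 the percentages run from 5 to 95 in steps of 10, and the 10th percentile falls midway between the smallest and second-smallest observations, as in the text.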


A Probability Plot 


Suppose now that for percentages $100(i - .5)/n$ $(i = 1, \ldots, n)$ the percentiles are
determined for a specified population distribution whose plausibility is being
investigated. If the sample was actually selected from the specified distribution,
the sample percentiles (ordered sample observations) should be reasonably close
to the corresponding population distribution percentiles. That is, for $i = 1, 2, \ldots, n$
there should be reasonable agreement between the ith smallest sample observation
and the $[100(i - .5)/n]$th percentile for the specified distribution. Let’s consider the
(population percentile, sample percentile) pairs, that is, the pairs


( $[100(i - .5)/n]$th percentile of the distribution,  ith smallest sample observation )


for i = 1,...,n. Each such pair can be plotted as a point on a two-dimensional 
coordinate system. If the sample percentiles are close to the corresponding popula- 
tion distribution percentiles, the first number in each pair will be roughly equal to 
the second number. The plotted points will then fall close to a 45° line. Substantial 
deviations of the plotted points from a 45° line cast doubt on the assumption that the 
distribution under consideration is the correct one. 


Example 4.29 The value of a certain physical constant is known to an experimenter. The experi- 
menter makes n = 10 independent measurements of this value using a particular 
measurement device and records the resulting measurement errors (error = observed
value − true value). These observations appear in the accompanying table.


Percentage              5       15       25       35       45
z percentile         -1.645   -1.037    -.675    -.385    -.126
Sample observation   -1.91    -1.25     -.75     -.53      .20

Percentage             55       65       75       85       95
z percentile           .126     .385     .675    1.037    1.645
Sample observation     .35      .72      .87     1.40     1.56

Is it plausible that the random variable measurement error has a standard normal
distribution? The needed standard normal (z) percentiles are also displayed in the table.
Thus the points in the probability plot are $(-1.645, -1.91)$, $(-1.037, -1.25)$, ...,
and $(1.645, 1.56)$. Figure 4.33 shows the resulting plot. Although the points deviate
a bit from the 45° line, the predominant impression is that this line fits the points
very well. The plot suggests that the standard normal distribution is a reasonable
probability model for measurement error.

Figure 4.33 Plot of pairs (z percentile, observed value) for the data of Example 4.29


Figure 4.34 shows a plot of pairs (z percentile, observation) for a second sample 
of ten observations. The 45° line gives a good fit to the middle part of the sample but 
not to the extremes. The plot has a well-defined S-shaped appearance. The two small- 
est sample observations are considerably larger than the corresponding z percentiles 
(the points on the far left of the plot are well above the 45° line). Similarly, the two 
largest sample observations are much smaller than the associated z percentiles. This 
plot indicates that the standard normal distribution would not be a plausible choice 
for the probability model that gave rise to these observed measurement errors. 
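
The (z percentile, observation) pairs of Example 4.29 can be reproduced with a short script. This is a minimal sketch using Python's standard library; the helper name `normal_plot_pairs` is hypothetical, and the largest vertical gap from the 45° line is only an informal summary, not a formal test.

```python
from statistics import NormalDist

# Measurement errors from Example 4.29, already ordered.
errors = [-1.91, -1.25, -0.75, -0.53, 0.20, 0.35, 0.72, 0.87, 1.40, 1.56]


def normal_plot_pairs(data):
    """Pair the [100(i - .5)/n]th standard normal percentile with the
    ith smallest observation."""
    xs = sorted(data)
    n = len(xs)
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, xs))


pairs = normal_plot_pairs(errors)

# Largest vertical distance between a plotted point and the 45-degree line y = z.
max_gap = max(abs(obs - z) for z, obs in pairs)
```

Plotting `pairs` with any graphics package, together with the line y = z, gives a picture like the one described for Figure 4.33.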


Figure 4.34 Plots of pairs (z percentile, observed value) for the scenario of Example
4.29: second sample

An investigator is typically not interested in knowing just whether a particular
probability distribution, such as the standard normal distribution (normal with
$\mu = 0$ and $\sigma = 1$) or the exponential distribution with $\lambda = .1$, is a plausible model
for the population distribution from which the sample was selected. Instead, the 
issue is whether some member of a family of probability distributions specifies 
a plausible model—the family of normal distributions, the family of exponential 
distributions, the family of Weibull distributions, and so on. The values of the 
parameters of a distribution are usually not specified at the outset. If the family 
of Weibull distributions is under consideration as a model for lifetime data, are 
there any values of the parameters $\alpha$ and $\beta$ for which the corresponding Weibull
distribution gives a good fit to the data? Fortunately, it is frequently the case that 
just one probability plot will suffice for assessing the plausibility of an entire 
family. If the plot deviates substantially from a straight line, no member of the 
family is plausible. When the plot is quite straight, further work is necessary to 
estimate values of the parameters that yield the most reasonable distribution of 
the specified type. 

Let’s focus on a plot for checking normality. Such a plot is useful in applied 
work because many formal statistical procedures give accurate inferences only when 
the population distribution is at least approximately normal. These procedures should 
generally not be used if the normal probability plot shows a very pronounced depar- 
ture from linearity. The key to constructing an omnibus normal probability plot is the 
relationship between standard normal (z) percentiles and those for any other normal 
distribution: 


normal ($\mu$, $\sigma$) percentile = $\mu + \sigma \cdot$ (corresponding z percentile)


Consider first the case $\mu = 0$. If each observation is exactly equal to the
corresponding normal percentile for some value of $\sigma$, the pairs ($\sigma \cdot$ [z percentile],
observation) fall on a 45° line, which has slope 1. This then implies that the
(z percentile, observation) pairs fall on a line passing through (0, 0) (i.e., one with
y-intercept 0) but having slope $\sigma$ rather than 1. The effect of a nonzero value of
$\mu$ is simply to change the y-intercept from 0 to $\mu$.


A plot of the n pairs 
($[100(i - .5)/n]$th z percentile, ith smallest observation)


is called a normal probability plot. If the sample observations are in fact
drawn from a normal distribution with mean value $\mu$ and standard deviation
$\sigma$, the points should fall close to a straight line with slope $\sigma$ and intercept $\mu$.
Thus a plot for which the points fall close to some straight line suggests that 
the assumption of a normal population distribution is plausible. 


EXAMPLE 4.30 There has been recent increased use of augered cast-in-place (ACIP) and drilled dis- 
placement (DD) piles in the foundations of buildings and transportation structures. 
In the article “Design Methodology for Axially Loaded Auger Cast-in-Place and 
Drilled Displacement Piles” (J. of Geotech. Geoenviron. Engr., 2012: 1431-1441), 
researchers propose a design methodology to enhance the efficiency of these piles. 
Here are length-diameter ratio measurements based on 17 static pile load tests on
ACIP and DD piles from various construction sites. The values of p for which z percentiles
are needed are $(1 - .5)/17 = .029$, $(2 - .5)/17 = .088$, ..., and .971.


$x_{(i)}$:       30.86  37.68  39.04  42.78  42.89  42.89  45.05  47.08  47.08
z percentile:  -1.89  -1.35  -1.05  -0.82  -0.63  -0.46  -0.30  -0.15   0.00

$x_{(i)}$:       48.79  48.79  52.56  52.56  54.80  55.17  56.31  59.94
z percentile:   0.15   0.30   0.46   0.63   0.82   1.05   1.35   1.89


Figure 4.35 shows the corresponding normal probability plot generated by the R 
software package. The pattern in the plot is quite straight, indicating it is plausible 
that the population distribution of length-diameter ratio is normal. 


Figure 4.35 Normal probability plot from R for the Length-Diameter Ratio data
(plot title: Normal Quantile Plot for Length-Diameter Ratio; horizontal axis: z percentile)
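
The coordinates behind a plot like Figure 4.35 can be computed without special software. The sketch below uses Python rather than R and is not the code that produced the figure; the helper names `z_percentiles` and `fit_line` are hypothetical. It also fits an ordinary least-squares line to the points, whose slope and intercept give rough graphical readings of $\sigma$ and $\mu$ (they are not the formal estimators discussed later in the book).

```python
from statistics import NormalDist, mean

# Length-diameter ratios from Example 4.30, already ordered.
ratios = [30.86, 37.68, 39.04, 42.78, 42.89, 42.89, 45.05, 47.08, 47.08,
          48.79, 48.79, 52.56, 52.56, 54.80, 55.17, 56.31, 59.94]


def z_percentiles(n):
    """Standard normal percentiles for p = (i - .5)/n, i = 1, ..., n."""
    return [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for the points (x, y)."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, ybar - slope * xbar


z = z_percentiles(len(ratios))
slope, intercept = fit_line(z, sorted(ratios))
```

Because the z percentiles are symmetric about 0, the fitted intercept coincides with the sample mean of the ratios; the slope plays the role of $\sigma$ in the relationship stated above.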


There is an alternative version of a normal probability plot in which the z percen- 
tile axis is replaced by a nonlinear percentage axis. The scaling on this axis is constructed 
so that plotted points should again fall close to a line when the sampled distribution is 
normal. Figure 4.36 shows such a plot from Minitab for the ratio data of Example 4.30. 
(The last two numbers in the small box on the right will be explained in Chapter 14.)

Figure 4.36 Normal probability plot of the ratio data from Minitab

A nonnormal population distribution can often be placed in one of the following
three categories:


1. It is symmetric and has “lighter tails” than does a normal distribution; that is, the 
density curve declines more rapidly out in the tails than does a normal curve. 


2. It is symmetric and heavy-tailed compared to a normal distribution. 
3. It is skewed. 


A uniform distribution is light-tailed, since its density function drops to zero outside 
a finite interval. The Cauchy density function $f(x) = 1/[\pi\beta(1 + ((x - \theta)/\beta)^2)]$ for
$-\infty < x < \infty$ is heavy-tailed, since $1/(1 + x^2)$ declines much less rapidly than does
$e^{-x^2/2}$. Lognormal and Weibull distributions are among those that are skewed. When
the points in a normal probability plot do not adhere to a straight line, the pattern 
will frequently suggest that the population distribution is in a particular one of these 
three categories. 

The largest and smallest observations in a sample from a light-tailed distribu- 
tion are usually not as extreme as would be expected from a normal random sample. 
Visualize a straight line drawn through the middle part of the plot; points on the far 
right tend to be below the line (observed value < z percentile), whereas points on the 
left end of the plot tend to fall above the straight line (observed value > z percentile). 
The result is an S-shaped pattern of the type pictured in Figure 4.34. 

A sample from a heavy-tailed distribution also tends to produce an S-shaped 
plot. However, in contrast to the light-tailed case, the left end of the plot curves 
downward (observed < z percentile), as shown in Figure 4.37(a). If the underlying 
distribution is positively skewed (a short left tail and a long right tail), the smallest 
sample observations will be larger than expected from a normal sample and so will 
the largest observations. In this case, points on both ends of the plot will fall above 
a straight line through the middle part, yielding a curved pattern, as illustrated in 
Figure 4.37(b). A sample from a lognormal distribution will usually produce such 
a pattern. A plot of (z percentile, ln(x)) pairs should then resemble a straight line.

Figure 4.37 Probability plots that suggest a nonnormal distribution: (a) a plot consistent with a heavy-tailed
distribution; (b) a plot consistent with a positively skewed distribution
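
The three patterns can be seen numerically by plotting idealized "samples" made of exact percentiles from nonnormal distributions. The Python sketch below is an illustration under stated assumptions: the helper names `end_residuals` and `describe` are hypothetical, the line is fitted by least squares to the middle half of the points, and only the signs of the end deviations are examined. Real samples are subject to sampling variability, so a sign pattern alone should not be read as evidence of nonnormality.

```python
import math
from statistics import NormalDist, mean


def end_residuals(sample):
    """Vertical distances of the smallest and largest plotted points from a
    least-squares line fitted to the middle half of a normal probability plot."""
    ys = sorted(sample)
    n = len(ys)
    zs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    lo, hi = n // 4, n - n // 4              # indices bounding the middle half
    zm, ym = zs[lo:hi], ys[lo:hi]
    zbar, ybar = mean(zm), mean(ym)
    slope = (sum((z - zbar) * (y - ybar) for z, y in zip(zm, ym))
             / sum((z - zbar) ** 2 for z in zm))

    def line(z):
        return ybar + slope * (z - zbar)

    return ys[0] - line(zs[0]), ys[-1] - line(zs[-1])


def describe(sample):
    """Sign pattern of the end residuals (magnitude is deliberately ignored)."""
    left, right = end_residuals(sample)
    if left > 0 and right < 0:
        return "ends pulled in: light tails"
    if left < 0 and right > 0:
        return "ends pushed out: heavy tails"
    if left > 0 and right > 0:
        return "both ends above: positive skew"
    if left < 0 and right < 0:
        return "both ends below: negative skew"
    return "no clear pattern"


# Idealized "samples": exact percentiles of three nonnormal distributions.
n = 20
p = [(i - 0.5) / n for i in range(1, n + 1)]
uniform = p                                             # uniform on (0, 1)
cauchy = [math.tan(math.pi * (q - 0.5)) for q in p]     # standard Cauchy
exponential = [-math.log(1 - q) for q in p]             # exponential, lambda = 1
```

Calling `describe` on the three lists reproduces the light-tailed, heavy-tailed, and positively skewed patterns described in the text.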


Even when the population distribution is normal, the sample percentiles will 
not coincide exactly with the theoretical percentiles because of sampling variability. 
How much can the points in the probability plot deviate from a straight-line pattern 
before the assumption of population normality is no longer plausible? This is not an 
easy question to answer. Generally speaking, a small sample from a normal distribu- 
tion is more likely to yield a plot with a nonlinear pattern than is a large sample. The 
book Fitting Equations to Data (see the Chapter 13 bibliography) presents the results
of a simulation study in which numerous samples of different sizes were selected
from normal distributions. The authors concluded that there is typically greater vari- 
ation in the appearance of the probability plot for sample sizes smaller than 30, and 
only for much larger sample sizes does a linear pattern generally predominate. When 
a plot is based on a small sample size, only a very substantial departure from linearity 
should be taken as conclusive evidence of nonnormality. A similar comment applies 
to probability plots for checking the plausibility of other types of distributions. 


Beyond Normality 


Consider a family of probability distributions involving two parameters, $\theta_1$ and $\theta_2$,
and let $F(x; \theta_1, \theta_2)$ denote the corresponding cdf’s. The family of normal distributions
is one such family, with $\theta_1 = \mu$, $\theta_2 = \sigma$, and $F(x; \mu, \sigma) = \Phi[(x - \mu)/\sigma]$.
Another example is the Weibull family, with $\theta_1 = \alpha$, $\theta_2 = \beta$, and


$$F(x; \alpha, \beta) = 1 - e^{-(x/\beta)^{\alpha}}$$


Still another family of this type is the gamma family, for which the cdf is an integral 
involving the incomplete gamma function that cannot be expressed in any simpler form. 

The parameters $\theta_1$ and $\theta_2$ are said to be location and scale parameters, respectively,
if $F(x; \theta_1, \theta_2)$ is a function of $(x - \theta_1)/\theta_2$. The parameters $\mu$ and $\sigma$ of the
normal family are location and scale parameters, respectively. In general, changing $\theta_1$
shifts the location of the corresponding density curve to the right or left, and changing
$\theta_2$ amounts to stretching or compressing the horizontal measurement scale. Another
example is given by the cdf


$$F(x; \theta_1, \theta_2) = 1 - e^{-e^{(x - \theta_1)/\theta_2}} \qquad -\infty < x < \infty$$


A random variable with this cdf is said to have an extreme value distribution. It is 
used in applications involving component lifetime and material strength. 

Although the form of the extreme value cdf might at first glance suggest that $\theta_1$
is the point of symmetry for the density function, and therefore the mean and median,
this is not the case. Instead, $P(X \le \theta_1) = F(\theta_1; \theta_1, \theta_2) = 1 - e^{-1} = .632$, and the
density function $f(x; \theta_1, \theta_2) = F'(x; \theta_1, \theta_2)$ is negatively skewed (a long lower tail).
Similarly, the scale parameter $\theta_2$ is not the standard deviation ($\mu = \theta_1 - .5772\theta_2$ and
$\sigma = 1.283\theta_2$). However, changing the value of $\theta_1$ does rigidly shift the density curve
to the left or right, whereas a change in $\theta_2$ rescales the measurement axis.
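
These facts about the extreme value distribution are easy to check numerically. The following Python sketch assumes the cdf form $1 - e^{-e^{(x - \theta_1)/\theta_2}}$; the function names and the parameter values 5 and 2 are arbitrary choices for illustration.

```python
import math


def extreme_value_cdf(x, theta1=0.0, theta2=1.0):
    """cdf F(x; theta1, theta2) = 1 - exp(-exp((x - theta1)/theta2))."""
    return 1 - math.exp(-math.exp((x - theta1) / theta2))


def extreme_value_percentile(p, theta1=0.0, theta2=1.0):
    """Solve F(x) = p for x: x = theta1 + theta2 * ln(-ln(1 - p))."""
    return theta1 + theta2 * math.log(-math.log(1 - p))


theta1, theta2 = 5.0, 2.0            # arbitrary illustrative values
at_location = extreme_value_cdf(theta1, theta1, theta2)
median = extreme_value_percentile(0.5, theta1, theta2)
mean = theta1 - 0.5772 * theta2      # constant quoted in the text
```

The probability to the left of $\theta_1$ is about .632 regardless of the parameter values, and both the mean and the median lie below $\theta_1$, with the mean the smaller of the two, which is consistent with a long lower tail.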

The parameter $\beta$ of the Weibull distribution is a scale parameter, but $\alpha$ is not
a location parameter. A similar comment applies to the parameters $\alpha$ and $\beta$ of the
gamma distribution. And for the lognormal distribution, $\mu$ is not a location parameter,
nor is $\sigma$ a scale parameter. In the usual form, the density function for any member
of these families is positive for $x > 0$ and 0 otherwise. Examples and exercises
in the two previous sections introduced a third location (i.e., threshold) parameter
$\gamma$ for these three distributions; this shifts the density function so that it is positive if
$x > \gamma$ and zero otherwise.

When the family under consideration has only location and scale parameters, 
the issue of whether any member of the family is a plausible population distribution 
can be addressed via a single, easily constructed probability plot. One first obtains 
the percentiles of the standard distribution, the one with $\theta_1 = 0$ and $\theta_2 = 1$, for
percentages $100(i - .5)/n$ $(i = 1, \ldots, n)$. The n (standardized percentile, observation)
pairs give the points in the plot. This is exactly what we did to obtain an omnibus 
normal probability plot. Somewhat surprisingly, this methodology can be applied to 
yield an omnibus Weibull probability plot. The key result is that if X has a Weibull 
distribution with shape parameter $\alpha$ and scale parameter $\beta$, then the transformed
variable ln(X) has an extreme value distribution with location parameter $\theta_1 = \ln(\beta)$
and scale parameter $\theta_2 = 1/\alpha$. Thus a plot of the (extreme value standardized percentile,
ln(x)) pairs showing a strong linear pattern provides support for choosing the
Weibull distribution as a population model. 
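
The key result can be verified in a few lines. The derivation below is a sketch that assumes the Weibull cdf $1 - e^{-(x/\beta)^{\alpha}}$ for $x > 0$ and works out the cdf of ln(X) at an arbitrary point $y$.

```latex
\begin{aligned}
P(\ln X \le y) &= P(X \le e^{y})
  = 1 - \exp\!\left[-\left(\frac{e^{y}}{\beta}\right)^{\alpha}\right] \\
&= 1 - \exp\!\left[-e^{\alpha\,(y - \ln\beta)}\right] \\
&= 1 - \exp\!\left[-e^{(y - \theta_1)/\theta_2}\right],
\qquad \theta_1 = \ln\beta,\quad \theta_2 = 1/\alpha .
\end{aligned}
```

The last line is the extreme value cdf with location ln($\beta$) and scale $1/\alpha$. By the same argument used for the normal plot, a straight-line pattern in the (standardized percentile, ln(x)) plot has slope roughly $1/\alpha$ and intercept roughly ln($\beta$).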


EXAMPLE 4.31 The accompanying observations are on lifetime (in hours) of power apparatus insula- 
tion when thermal and electrical stress acceleration were fixed at particular values 
(“On the Estimation of Life of Power Apparatus Insulation Under Combined
Electrical and Thermal Stress,” IEEE Trans. on Electrical Insulation, 1985:
70-78). A Weibull probability plot necessitates first computing the 5th, 15th, ...,
and 95th percentiles of the standard extreme value distribution. The (100p)th
percentile $\eta(p)$ satisfies


$$p = F(\eta(p)) = 1 - e^{-e^{\eta(p)}}$$

from which $\eta(p) = \ln[-\ln(1 - p)]$.


Percentile   -2.97   -1.82   -1.25    -.84    -.51
x              282     501     741     851    1072
ln(x)         5.64    6.22    6.61    6.75    6.98

Percentile    -.23     .05     .33     .64    1.10
x             1122    1202    1585    1905    2138
ln(x)         7.02    7.09    7.37    7.55    7.67


The pairs $(-2.97, 5.64)$, $(-1.82, 6.22)$, ..., $(1.10, 7.67)$ are plotted as points in
Figure 4.38. The straightness of the plot argues strongly for using the Weibull
distribution as a model for insulation life, a conclusion also reached by the author of
the cited article.

Figure 4.38 A Weibull probability plot of the insulation lifetime data


The gamma distribution is an example of a family involving a shape parameter
for which there is no transformation $h(\cdot)$ such that $h(X)$ has a distribution that
depends only on location and scale parameters. Construction of a probability plot
necessitates first estimating the shape parameter from sample data (some methods
for doing this are described in Chapter 6). Sometimes an investigator wishes to
know whether the transformed variable $X^{\theta}$ has a normal distribution for some value
of $\theta$ (by convention, $\theta = 0$ is identified with the logarithmic transformation, in
which case X has a lognormal distribution). The book Graphical Methods for Data
Analysis, listed in the Chapter 1 bibliography, discusses this type of problem as well
as other refinements of probability plotting. Fortunately, the wide availability of
various probability plots with statistical software packages means that the user can 
often sidestep technical details. 
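
When a package does not offer a particular plot, the points are still easy to compute by hand. As one illustration, the Python sketch below rebuilds the Weibull probability plot coordinates of Example 4.31 from the formula $\eta(p) = \ln[-\ln(1 - p)]$; the function names are hypothetical and the code is not tied to any statistical package.

```python
import math

# Insulation lifetimes (hours) from Example 4.31, already ordered.
lifetimes = [282, 501, 741, 851, 1072, 1122, 1202, 1585, 1905, 2138]


def extreme_value_percentile(p):
    """(100p)th percentile of the standard extreme value distribution."""
    return math.log(-math.log(1 - p))


def weibull_plot_pairs(data):
    """Pair the [100(i - .5)/n]th standard extreme value percentile with
    ln(ith smallest observation)."""
    xs = sorted(data)
    n = len(xs)
    return [(extreme_value_percentile((i - 0.5) / n), math.log(x))
            for i, x in enumerate(xs, start=1)]


pairs = weibull_plot_pairs(lifetimes)
```

Rounded to two decimals, the first and last pairs agree with the values tabulated in the example; a roughly linear scatter of `pairs` supports a Weibull model.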


EXERCISES Section 4.6 (87-97) 


87. The accompanying normal probability plot was con- 
structed from a sample of 30 readings on tension for 
mesh screens behind the surface of video display tubes 
used in computer monitors. Does it appear plausible that 
the tension distribution is normal?

88. A sample of 15 female collegiate golfers was selected
and the clubhead velocity (km/hr) while swinging a 
driver was determined for each one, resulting in the fol- 
lowing data (“Hip Rotational Velocities During the 
Full Golf Swing,” J. of Sports Science and Medicine, 
2009: 296-299): 


69.0   69.7   72.7   80.3   81.0
85.0   86.0   86.3   86.7   87.7
89.3   90.7   91.0   92.5   93.0

The corresponding z percentiles are

-1.83  -1.28  -0.97  -0.73  -0.52
-0.34  -0.17   0.0    0.17   0.34
 0.52   0.73   0.97   1.28   1.83


Construct a normal probability plot and a dotplot. Is it 
plausible that the population distribution is normal? 


89. The accompanying sample consisting of n = 20 observa- 
tions on dielectric breakdown voltage of a piece of epoxy 
resin appeared in the article “Maximum Likelihood
Estimation in the 3-Parameter Weibull Distribution”
(IEEE Trans. on Dielectrics and Elec. Insul., 1996:
43-55). The values of $(i - .5)/n$ for which z percentiles
are needed are $(1 - .5)/20 = .025$, $(2 - .5)/20 = .075$, ...,
and .975. Would you feel comfortable estimating
population mean voltage using a method that assumed
a normal population distribution?

Observation   24.46  25.61  26.25  26.42  26.66
z percentile  -1.96  -1.44  -1.15   -.93   -.76
Observation   27.15  27.31  27.54  27.74  27.94
z percentile   -.60   -.45   -.32   -.19   -.06
Observation   27.98  28.04  28.28  28.49  28.50
z percentile    .06    .19    .32    .45    .60
Observation   28.87  29.11  29.13  29.50  30.88
z percentile    .76    .93   1.15   1.44   1.96


90. The article “A Probabilistic Model of Fracture in 
Concrete and Size Effects on Fracture Toughness” 
(Magazine of Concrete Res., 1996: 311-320) gives 
arguments for why fracture toughness in concrete speci- 
mens should have a Weibull distribution and presents 
several histograms of data that appear well fit by super- 
imposed Weibull curves. Consider the following sample 
of size n = 18 observations on toughness for high-strength
concrete (consistent with one of the histograms);
values of $p_i = (i - .5)/18$ are also given.


Observation    .47    .58    .65    .69    .72    .74
$p_i$         .0278  .0833  .1389  .1944  .2500  .3056
Observation    .77    .79    .80    .81    .82    .84
$p_i$         .3611  .4167  .4722  .5278  .5833  .6389
Observation    .86    .89    .91    .95   1.01   1.04
$p_i$         .6944  .7500  .8056  .8611  .9167  .9722


Construct a Weibull probability plot and comment. 


91. Construct a normal probability plot for the fatigue-crack 
propagation data given in Exercise 39 (Chapter 1). Does 
it appear plausible that propagation life has a normal 
distribution? Explain. 


92. The article “The Load-Life Relationship for M50 
Bearings with Silicon Nitride Ceramic Balls” 
(Lubrication Engr., 1984: 153-159) reports the accom- 
panying data on bearing load life (million revs.) for 
bearings tested at a 6.45 kN load.


47.1 68.1 68.1 90.8 103.6 106.0 115.0 
126.0 146.6 229.0 240.0 240.0 278.0 278.0 
289.0 289.0 367.0 385.9 392.0 505.0 


a. Construct a normal probability plot. Is normality 
plausible? 

b. Construct a Weibull probability plot. Is the Weibull 
distribution family plausible? 


93. Construct a probability plot that will allow you to assess
the plausibility of the lognormal distribution as a model
for the rainfall data of Exercise 83 in Chapter 1.


94. The accompanying observations are precipitation values
during March over a 30-year period in Minneapolis-St.
Paul.


 .77  1.20  3.00  1.62  2.81  2.48
1.74   .47  3.09  1.31  1.87   .96
 .81  1.43  1.51   .32  1.18  1.89
1.20  3.37  2.10   .59  1.35   .90
1.95  2.20   .52   .81  4.75  2.05


a. Construct and interpret a normal probability plot for
this data set.

b. Calculate the square root of each value and then
construct a normal probability plot based on this
transformed data. Does it seem plausible that the
square root of precipitation is normally distributed?

c. Repeat part (b) after transforming by cube roots.


95. Use a statistical software package to construct a normal
probability plot of the tensile ultimate-strength data
given in Exercise 13 of Chapter 1, and comment.


96. Let the ordered sample observations be denoted by
$y_1, y_2, \ldots, y_n$ ($y_1$ being the smallest and $y_n$ the largest). Our
suggested check for normality is to plot the
$(\Phi^{-1}((i - .5)/n), y_i)$ pairs. Suppose we believe that the
observations come from a distribution with mean 0, and let
$w_1, \ldots, w_n$ be the ordered absolute values of the $x_i$'s.
A half-normal plot is a probability plot of the $w_i$'s. More
specifically, since $P(|Z| \le w) = P(-w \le Z \le w) = 2\Phi(w) - 1$,
a half-normal plot is a plot of the
$(\Phi^{-1}\{[(i - .5)/n + 1]/2\}, w_i)$ pairs. The virtue of this
plot is that small or large outliers in the original sample will
now appear only at the upper end of the plot rather than at both
ends. Construct a half-normal plot for the following sample of
measurement errors, and comment: -3.78, -1.27, 1.44,
-.39, 12.38, -43.40, 1.15, -3.96, -2.34, 30.84.


97. The following failure time observations (1000s of hours)
resulted from accelerated life testing of 16 integrated
circuit chips of a certain type:


82.8 11.6 359.5 502.5 307.8 179.7 
242.0 26.5 244.8 304.3 379.1 212.6 
229.9 558.9 366.7 204.6 


Use the corresponding percentiles of the exponential 
distribution with A = 1 to construct a probability plot. 
Then explain why the plot assesses the plausibility of 
the sample having been generated from any exponential 
distribution. 


SUPPLEMENTARY EXERCISES (98-128) 


98. Let X = the time it takes a read/write head to locate a
desired record on a computer disk memory device once
the head has been positioned over the correct track. If
the disks rotate once every 25 millisec, a reasonable
assumption is that X is uniformly distributed on the
interval [0, 25].

a. Compute $P(10 \le X \le 20)$.

b. Compute $P(X \ge 10)$.

c. Obtain the cdf $F(x)$.

d. Compute $E(X)$ and $\sigma_X$.


99. A 12-in. bar that is clamped at both ends is to be subjected
to an increasing amount of stress until it snaps. Let
Y = the distance from the left end at which the break
occurs. Suppose Y has pdf


$$f(y) = \begin{cases} \dfrac{1}{24}\, y \left(1 - \dfrac{y}{12}\right) & 0 \le y \le 12 \\ 0 & \text{otherwise} \end{cases}$$


Compute the following:

a. The cdf of Y, and graph it.

b. $P(Y \le 4)$, $P(Y > 6)$, and $P(4 \le Y \le 6)$

c. $E(Y)$, $E(Y^2)$, and $V(Y)$

d. The probability that the break point occurs more than
2 in. from the expected break point.

e. The expected length of the shorter segment when the
break occurs.


100. Let X denote the time to failure (in years) of a certain
hydraulic component. Suppose the pdf of X is
$f(x) = 32/(x + 4)^3$ for $x > 0$.

a. Verify that f(x) is a legitimate pdf.

b. Determine the cdf.

c. Use the result of part (b) to calculate the probability
that time to failure is between 2 and 5 years.

d. What is the expected time to failure?

e. If the component has a salvage value equal to
100/(4 + x) when its time to failure is x, what is the
expected salvage value?


101. The completion time X for a certain task has cdf F(x)
given by


$$F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^3}{3} & 0 \le x < 1 \\ 1 - \dfrac{1}{2}\left(\dfrac{7}{3} - x\right)\left(\dfrac{7}{4} - \dfrac{3}{4}x\right) & 1 \le x \le \dfrac{7}{3} \\ 1 & x > \dfrac{7}{3} \end{cases}$$


a. Obtain the pdf f(x) and sketch its graph.

b. Compute $P(.5 \le X \le 2)$.

c. Compute E(X).


102. Let X represent the number of individuals who respond
to a particular online coupon offer. Suppose that X has
approximately a Weibull distribution with $\alpha = 10$ and
$\beta = 20$. Calculate the best possible approximation to
the probability that X is between 15 and 20, inclusive.


103. The article “Computer Assisted Net Weight Control”
(Quality Progress, 1983: 22-25) suggests a normal
distribution with mean 137.2 oz and standard deviation
1.6 oz for the actual contents of jars of a certain type. The
stated contents was 135 oz.

a. What is the probability that a single jar contains
more than the stated contents?

b. Among ten randomly selected jars, what is the probability
that at least eight contain more than the stated
contents?

c. Assuming that the mean remains at 137.2, to what
value would the standard deviation have to be changed
so that 95% of all jars contain more than the stated
contents?


104. When circuit boards used in the manufacture of compact
disc players are tested, the long-run percentage of
defectives is 5%. Suppose that a batch of 250 boards has been
received and that the condition of any particular board is
independent of that of any other board.

a. What is the approximate probability that at least 10%
of the boards in the batch are defective?

b. What is the approximate probability that there are
exactly 10 defectives in the batch?


105. Exercise 38 introduced two machines that produce wine
corks, the first one having a normal diameter distribution
with mean value 3 cm and standard deviation .1 cm, and
the second having a normal diameter distribution with
mean value 3.04 cm and standard deviation .02 cm.
Acceptable corks have diameters between 2.9 and 3.1 cm.
If 60% of all corks used come from the first machine and
a randomly selected cork is found to be acceptable, what
is the probability that it was produced by the first
machine?


106. The reaction time (in seconds) to a certain stimulus is a
continuous random variable with pdf


$$f(x) = \begin{cases} \dfrac{3}{2} \cdot \dfrac{1}{x^2} & 1 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}$$


a. Obtain the cdf. 

b. What is the probability that reaction time is at most 
2.5 sec? Between 1.5 and 2.5 sec? 

c. Compute the expected reaction time. 

d. Compute the standard deviation of reaction time. 

e. If an individual takes more than 1.5 sec to react, a light
comes on and stays on either until one further second
has elapsed or until the person reacts (whichever happens
first). Determine the expected amount of time
that the light remains lit. [Hint: Let h(X) = the time
that the light is on as a function of reaction time X.]


107. Let X denote the temperature at which a certain chemical
reaction takes place. Suppose that X has pdf


$$f(x) = \begin{cases} \dfrac{1}{9}(4 - x^2) & -1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$


a. Sketch the graph of f(x).

b. Determine the cdf and sketch it.

c. Is 0 the median temperature at which the reaction
takes place? If not, is the median temperature smaller
or larger than 0?

d. Suppose this reaction is independently carried out
once in each of ten different labs and that the pdf of
reaction time in each lab is as given. Let Y = the
number among the ten labs at which the temperature
exceeds 1. What kind of distribution does Y have?
(Give the names and values of any parameters.)


108. An oocyte is a female germ cell involved in reproduction.
Based on analyses of a large sample, the article
“Reproductive Traits of Pioneer Gastropod Species
Colonizing Deep-Sea Hydrothermal Vents After an
Eruption” (Marine Biology, 2011: 181-192) proposed
the following mixture of normal distributions as a model
for the distribution of X = oocyte diameter (μm):


$$f(x) = p f_1(x; \mu_1, \sigma) + (1 - p) f_2(x; \mu_2, \sigma)$$


where $f_1$ and $f_2$ are normal pdfs. Suggested parameter
values were $p = .35$, $\mu_1 = 4.4$, $\mu_2 = 5.0$, and $\sigma = .27$.

a. What is the expected (i.e., mean) value of oocyte
diameter?

b. What is the probability that oocyte diameter is
between 4.4 μm and 5.0 μm? [Hint: Write an
expression for the corresponding integral, carry the
integral operation through to the two components,
and then use the fact that each component is a normal
pdf.]

c. What is the probability that oocyte diameter is
smaller than its mean value? What does this imply
about the shape of the density curve?


109. The article “The Prediction of Corrosion by Statistical
Analysis of Corrosion Profiles” (Corrosion Science,
1985: 305-315) suggests the following cdf for the depth
X of the deepest pit in an experiment involving the exposure
of carbon manganese steel to acidified seawater.


$$F(x; \alpha, \beta) = e^{-e^{-(x - \alpha)/\beta}} \qquad -\infty < x < \infty$$


The authors propose the values $\alpha = 150$ and $\beta = 90$.
Assume this to be the correct model.

a. What is the probability that the depth of the deepest
pit is at most 150? At most 300? Between 150 and
300?


b. Below what value will the depth of the maximum pit
be observed in 90% of all such experiments?

c. What is the density function of X?

d. The density function can be shown to be unimodal (a
single peak). Above what value on the measurement
axis does this peak occur? (This value is the mode.)

e. It can be shown that $E(X) \approx .5772\beta + \alpha$. What is the
mean for the given values of $\alpha$ and $\beta$, and how does
it compare to the median and mode? Sketch the graph
of the density function. [Note: This is called the largest
extreme value distribution.]


110. Let t = the amount of sales tax a retailer owes the government
for a certain period. The article “Statistical
Sampling in Tax Audits” (Statistics and the Law, 2008:
320-343) proposes modeling the uncertainty in t by
regarding it as a normally distributed random variable
with mean value $\mu$ and standard deviation $\sigma$ (in the article,
these two parameters are estimated from the results of
a tax audit involving n sampled transactions). If a
represents the amount the retailer is assessed, then an
under-assessment results if t > a and an over-assessment
results if a > t. The proposed penalty (i.e., loss) function
for over- or under-assessment is L(a, t) = t - a if t > a
and = k(a - t) if t ≤ a (k > 1 is suggested to incorporate
the idea that over-assessment is more serious than
under-assessment).

a. Show that $a^* = \mu + \sigma\Phi^{-1}(1/(k + 1))$ is the value of
a that minimizes the expected loss, where $\Phi^{-1}$ is the
inverse function of the standard normal cdf.

b. If k = 2 (suggested in the article), $\mu$ = $100,000, and
$\sigma$ = $10,000, what is the optimal value of a, and what
is the resulting probability of over-assessment?


111. The mode of a continuous distribution is the value x* that
maximizes f(x).

a. What is the mode of a normal distribution with
parameters $\mu$ and $\sigma$?

b. Does the uniform distribution with parameters A and
B have a single mode? Why or why not?

c. What is the mode of an exponential distribution with
parameter $\lambda$? (Draw a picture.)

d. If X has a gamma distribution with parameters $\alpha$ and
$\beta$, and $\alpha > 1$, find the mode. [Hint: ln[f(x)] will be
maximized iff f(x) is, and it may be simpler to take the
derivative of ln[f(x)].]

e. What is the mode of a chi-squared distribution having
$\nu$ degrees of freedom?


112. The article “Error Distribution in Navigation” (J. of the
Institute of Navigation, 1971: 429-442) suggests that the
frequency distribution of positive errors (magnitudes of
errors) is well approximated by an exponential distribution.
Let X = the lateral position error (nautical miles), which
can be either negative or positive. Suppose the pdf of X is

\[
f(x) = (.1)e^{-.2|x|}, \quad -\infty < x < \infty
\]

a. Sketch a graph of f(x) and verify that f(x) is a legitimate
pdf (show that it integrates to 1).


b. Obtain the cdf of X and sketch it.

c. Compute P(X ≤ 0), P(X ≤ 2), P(-1 ≤ X ≤ 2), and
the probability that an error of more than 2 miles is
made.


113. The article “Statistical Behavior Modeling for Driver-
Adaptive Precrash Systems” (IEEE Trans. on Intelligent
Transp. Systems, 2013: 1-9) proposed the following mixture
of two exponential distributions for modeling the
behavior of what the authors called “the criticality level of
a situation” X.

\[
f(x; \lambda_1, \lambda_2, p) = \begin{cases} p\lambda_1 e^{-\lambda_1 x} + (1 - p)\lambda_2 e^{-\lambda_2 x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}
\]

This is often called the hyperexponential or mixed exponential
distribution. This distribution is also proposed
as a model for rainfall amount in “Modeling Monsoon
Affected Rainfall of Pakistan by Point Processes” (J. of
Water Resources Planning and Mgmnt., 1992: 671-688).

a. Determine E(X) and V(X). Hint: For X distributed
exponentially, $E(X) = 1/\lambda$ and $V(X) = 1/\lambda^2$; what
does this imply about $E(X^2)$?

b. Determine the cdf of X.

c. If $p = .5$, $\lambda_1 = 40$, and $\lambda_2 = 200$ (values of the $\lambda$'s
suggested in the cited article), calculate P(X > .01).

d. For the parameter values given in (c), what is the
probability that X is within one standard deviation of
its mean value?

e. The coefficient of variation of a random variable (or
distribution) is $CV = \sigma/\mu$. What is CV for an exponential
rv? What can you say about the value of CV
when X has a hyperexponential distribution?

f. What is CV for an Erlang distribution with parameters
$\lambda$ and n as defined in Exercise 68? [Note: In applied
work, the sample CV is used to decide which of the
three distributions might be appropriate.]


114. Suppose a particular state allows individuals filing tax
returns to itemize deductions only if the total of all itemized
deductions is at least $5000. Let X (in 1000s of dollars)
be the total of itemized deductions on a randomly
chosen form. Assume that X has the pdf

\[
f(x; \alpha) = \begin{cases} k/x^{\alpha} & x \ge 5 \\ 0 & \text{otherwise} \end{cases}
\]

a. Find the value of k. What restriction on $\alpha$ is necessary?

b. What is the cdf of X?

c. What is the expected total deduction on a randomly
chosen form? What restriction on $\alpha$ is necessary for
E(X) to be finite?

d. Show that ln(X/5) has an exponential distribution with
parameter $\alpha - 1$.


115. Let $I_i$ be the input current to a transistor and $I_o$ be the
output current. Then the current gain is proportional to
$\ln(I_o/I_i)$. Suppose the constant of proportionality is 1
(which amounts to choosing a particular unit of measurement),
so that current gain = X = $\ln(I_o/I_i)$. Assume X is
normally distributed with $\mu = 1$ and $\sigma = .05$.

a. What type of distribution does the ratio $I_o/I_i$ have?

b. What is the probability that the output current is
more than twice the input current?

c. What are the expected value and variance of the ratio
of output to input current?


116. The article “Response of $SiC_f/Si_3N_4$ Composites Under
Static and Cyclic Loading: An Experimental and
Statistical Analysis” (J. of Engr. Materials and
Technology, 1997: 186-193) suggests that tensile strength
(MPa) of composites under specified conditions can be
modeled by a Weibull distribution with $\alpha = 9$ and
$\beta = 180$.

a. Sketch a graph of the density function.

b. What is the probability that the strength of a randomly
selected specimen will exceed 175? Will be
between 150 and 175?

c. If two randomly selected specimens are chosen and
their strengths are independent of one another, what
is the probability that at least one has a strength
between 150 and 175?

d. What strength value separates the weakest 10% of all
specimens from the remaining 90%?


117. Let Z have a standard normal distribution and define a
new rv Y by $Y = \sigma Z + \mu$. Show that Y has a normal
distribution with parameters $\mu$ and $\sigma$. [Hint: Y ≤ y iff
Z ≤ ? Use this to find the cdf of Y and then differentiate
it with respect to y.]


118. a. Suppose the lifetime X of a component, when measured
in hours, has a gamma distribution with parameters
$\alpha$ and $\beta$. Let Y = the lifetime measured in minutes.
Derive the pdf of Y. [Hint: Y ≤ y iff X ≤ y/60.
Use this to obtain the cdf of Y and then differentiate to
obtain the pdf.]

b. If X has a gamma distribution with parameters $\alpha$ and
$\beta$, what is the probability distribution of Y = cX?


119. In Exercises 117 and 118, as well as many other situations,
one has the pdf f(x) of X and wishes to know the pdf
of Y = h(X). Assume that h(·) is an invertible function,
so that y = h(x) can be solved for x to yield x = k(y).
Then it can be shown that the pdf of Y is

\[
g(y) = f[k(y)] \cdot |k'(y)|
\]

a. If X has a uniform distribution with A = 0 and
B = 1, derive the pdf of Y = -ln(X).

b. Work Exercise 117, using this result.

c. Work Exercise 118(b), using this result.


120. Based on data from a dart-throwing experiment, the article
“Shooting Darts” (Chance, Summer 1997, 16-19)
proposed that the horizontal and vertical errors from aiming
at a point target should be independent of one another,
each with a normal distribution having mean 0 and
variance $\sigma^2$. It can then be shown that the pdf of the distance
V from the target to the landing point is

\[
f(v) = \frac{v}{\sigma^2} e^{-v^2/(2\sigma^2)}, \quad v > 0
\]

a. This pdf is a member of what family introduced in
this chapter?

b. If $\sigma = 20$ mm (close to the value suggested in the
paper), what is the probability that a dart will land
within 25 mm (roughly 1 in.) of the target?


121. The article “Three Sisters Give Birth on the Same
Day” (Chance, Spring 2001, 23-25) used the fact that
three Utah sisters had all given birth on March 11, 1998
as a basis for posing some interesting questions regarding
birth coincidences.

a. Disregarding leap year and assuming that the other
365 days are equally likely, what is the probability
that three randomly selected births all occur on
March 11? Be sure to indicate what, if any, extra
assumptions you are making.

b. With the assumptions used in part (a), what is the
probability that three randomly selected births all
occur on the same day?

c. The author suggested that, based on extensive data,
the length of gestation (time between conception and
birth) could be modeled as having a normal distribution
with mean value 280 days and standard deviation
19.88 days. The due dates for the three Utah
sisters were March 15, April 1, and April 4, respectively.
Assuming that all three due dates are at the
mean of the distribution, what is the probability that
all births occurred on March 11? [Hint: The deviation
of birth date from due date is normally distributed
with mean 0.]

d. Explain how you would use the information in part (c)
to calculate the probability of a common birth date.


122. Let X denote the lifetime of a component, with f(x) and
F(x) the pdf and cdf of X. The probability that the component
fails in the interval $(x, x + \Delta x)$ is approximately
$f(x) \cdot \Delta x$. The conditional probability that it fails in
$(x, x + \Delta x)$ given that it has lasted at least x is
$f(x) \cdot \Delta x/[1 - F(x)]$. Dividing this by $\Delta x$ produces the
failure rate function:

\[
r(x) = \frac{f(x)}{1 - F(x)}
\]

An increasing failure rate function indicates that older
components are increasingly likely to wear out, whereas
a decreasing failure rate is evidence of increasing reliability
with age. In practice, a “bathtub-shaped” failure
rate is often assumed.

a. If X is exponentially distributed, what is r(x)?

b. If X has a Weibull distribution with parameters $\alpha$ and
$\beta$, what is r(x)? For what parameter values will r(x)
be increasing? For what parameter values will r(x)
decrease with x?

c. Since $r(x) = -(d/dx)\ln[1 - F(x)]$, $\ln[1 - F(x)] =
-\int r(x)\,dx$. Suppose
\[
r(x) = \begin{cases} \alpha\left(1 - \dfrac{x}{\beta}\right) & 0 \le x \le \beta \\ 0 & \text{otherwise} \end{cases}
\]

so that if a component lasts $\beta$ hours, it will last forever
(while seemingly unreasonable, this model can be used to
study just “initial wearout”). What are the cdf and pdf of X?


123. Let U have a uniform distribution on the interval [0, 1].
Then observed values having this distribution can be
obtained from a computer’s random number generator.
Let $X = -(1/\lambda)\ln(1 - U)$.

a. Show that X has an exponential distribution with
parameter $\lambda$. [Hint: The cdf of X is F(x) = P(X ≤ x);
X ≤ x is equivalent to U ≤ ?]

b. How would you use part (a) and a random number
generator to obtain observed values from an exponential
distribution with parameter $\lambda = 10$?


124. Consider an rv X with mean $\mu$ and standard deviation $\sigma$, and
let g(X) be a specified function of X. The first-order Taylor
series approximation to g(X) in the neighborhood of $\mu$ is

\[
g(X) \approx g(\mu) + g'(\mu) \cdot (X - \mu)
\]

The right-hand side of this equation is a linear function
of X. If the distribution of X is concentrated in an interval
over which g(·) is approximately linear [e.g., $\sqrt{x}$ is
approximately linear in (1, 2)], then the equation yields
approximations to E(g(X)) and V(g(X)).

a. Give expressions for these approximations. [Hint:
Use rules of expected value and variance for a linear
function aX + b.]

b. If the voltage v across a medium is fixed but current
I is random, then resistance will also be a random
variable related to I by R = v/I. If $\mu_I = 20$ and
$\sigma_I = .5$, calculate approximations to $\mu_R$ and $\sigma_R$.


125. A function g(x) is convex if the chord connecting any two
points on the function’s graph lies above the graph.
When g(x) is differentiable, an equivalent condition is
that for every x, the tangent line at x lies entirely on or
below the graph. (See the figure below.) How does
$g(\mu) = g(E(X))$ compare to E(g(X))? [Hint: The equation
of the tangent line at $x = \mu$ is $y = g(\mu) + g'(\mu) \cdot (x - \mu)$.
Use the condition of convexity, substitute X for x, and
take expected values. Note: Unless g(x) is linear, the
resulting inequality (usually called Jensen’s inequality)
is strict (< rather than ≤); it is valid for both continuous
and discrete rv’s.]

[Figure: graph of a convex function with its tangent line at x = μ]


126. Let X have a Weibull distribution with parameters $\alpha = 2$
and $\beta$. Show that $Y = 2X^2/\beta^2$ has a chi-squared distribution
with $\nu = 2$. [Hint: The cdf of Y is P(Y ≤ y); express
this probability in the form P(X ≤ g(y)), use the fact that
X has a cdf of the form in Expression (4.12), and differentiate
with respect to y to obtain the pdf of Y.]


127. An individual’s credit score is a number calculated based
on that person’s credit history that helps a lender determine
how much he/she should be loaned or what credit
limit should be established for a credit card. An article in
the Los Angeles Times gave data which suggested that a
beta distribution with parameters A = 150, B = 850,
$\alpha = 8$, $\beta = 2$ would provide a reasonable approximation
to the distribution of American credit scores. [Note:
credit scores are integer-valued.]

a. Let X represent a randomly selected American credit
score. What are the mean value and standard deviation
of this random variable? What is the probability that X
is within 1 standard deviation of its mean value?

b. What is the approximate probability that a randomly
selected score will exceed 750 (which lenders consider
a very good score)?


128. Let V denote rainfall volume and W denote runoff volume
(both in mm). According to the article “Runoff Quality
Analysis of Urban Catchments with Analytical
Probability Models” (J. of Water Resource Planning
Joint Probability Distributions and Random Samples


INTRODUCTION 


In Chapters 3 and 4 we developed probability models for a single random variable.
Many problems in probability and statistics involve working simultaneously
with two or more random variables. For example, X and Y might be the height
and weight, respectively, of a randomly selected individual. Or $X_1$, $X_2$, and $X_3$
might be the number of purchases made with Visa, MasterCard, and American
Express credit cards, respectively, at a store on a particular day. In Section 5.1
we discuss probability models for the joint (i.e., simultaneous) behavior of two
or more random variables. The very important concept of independence of
several random variables is then introduced and explored. Section 5.2 considers
the expected value of a function of two or more random variables [e.g.,
the expected value of $Y/X^2$, the body mass index (BMI) when X is expressed
in cm and Y is expressed in kg]. This leads to a discussion of covariance and
correlation as measures of the degree of association between two variables.
At the end of the section, the bivariate normal distribution is introduced as a
generalization of the univariate normal distribution.

Sections 5.3 and 5.4 consider functions of the n variables $X_1, X_2, \ldots, X_n$
that constitute a sample from some population or distribution (for example, a
sample of weights of newborn children). The most important function of this
type is the sample average $(X_1 + X_2 + \cdots + X_n)/n$. We will call any such function,
itself a random variable, a statistic. Methods from probability are used to obtain
information about the distribution of a statistic. The premier result of this type
is the Central Limit Theorem (CLT), the basis for many inferential procedures
involving large sample sizes. The last section of the chapter deals with linear
functions of the form $a_1X_1 + \cdots + a_nX_n$, where the $a_i$’s are numerical constants.


5.1 Jointly Distributed Random Variables


There are many experimental situations in which more than one random variable (rv) 
will be of interest to an investigator. We first consider joint probability distributions 
for two random variables. The “pure” cases, in which both variables are discrete or 
both are continuous, are the ones most frequently encountered in practice. 


Two Discrete Random Variables 


The probability mass function (pmf) of a single discrete rv X specifies how much prob- 
ability mass is placed on each possible X value. The joint pmf of two discrete rv’s X and 
Y describes how much probability mass is placed on each possible pair of values (x, y). 


DEFINITION Let X and Y be two discrete rv’s defined on the sample space $\mathcal{S}$ of an experiment.
The joint probability mass function p(x, y) is defined for each pair of
numbers (x, y) by

\[
p(x, y) = P(X = x \text{ and } Y = y)
\]

It must be the case that $p(x, y) \ge 0$ and $\sum_x \sum_y p(x, y) = 1$.

Now let A be any particular set consisting of pairs of (x, y) values (e.g.,
A = {(x, y): x + y = 5} or {(x, y): max(x, y) ≤ 3}). Then the probability
P[(X, Y) ∈ A] that the random pair (X, Y) lies in the set A is obtained by summing
the joint pmf over pairs in A:

\[
P[(X, Y) \in A] = \mathop{\sum\sum}_{(x, y) \in A} p(x, y)
\]


EXAMPLE 5.1 Anyone who purchases an insurance policy for a home or automobile must specify a
deductible amount, the amount of loss to be absorbed by the policyholder before the
insurance company begins paying out. Suppose that a particular company offers auto
deductible amounts of $100, $500, and $1000, and homeowner deductible amounts
of $500, $1000, and $5000. Consider randomly selecting someone who has both
auto and homeowner insurance with this company, and let X = the amount of the
auto policy deductible and Y = the amount of the homeowner policy deductible. The
joint pmf of these two variables appears in the accompanying joint probability table:


                         y
p(x, y)        500     1000     5000
       100     .30      .05       0
x      500     .15      .20      .05
      1000     .10      .10      .05

According to this joint pmf, there are nine possible (X, Y) pairs: (100, 500), (100,
1000), ..., and finally (1000, 5000). The probability of (100, 500) is p(100, 500) =
P(X = 100, Y = 500) = .30. Clearly p(x, y) ≥ 0, and it is easily confirmed that the
sum of the nine displayed probabilities is 1. The probability P(X = Y) is computed
by summing p(x, y) over the two (x, y) pairs for which the two deductible amounts
are identical:


P(X = Y) = p(500, 500) + p(1000, 1000) = .15 + .10 = .25


Similarly, the probability that the auto deductible amount is at least $500 is the sum
of all probabilities corresponding to (x, y) pairs for which x ≥ 500; this is the sum
of the probabilities in the bottom two rows of the joint probability table:


P(X ≥ 500) = .15 + .20 + .05 + .10 + .10 + .05 = .65 ■
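
The two calculations in Example 5.1 both amount to summing the joint pmf over a set A of (x, y) pairs. The following Python sketch illustrates this under the assumption that the joint probability table is stored as a dictionary; the names `joint_pmf` and `prob` are illustrative choices, not notation from the text.

```python
# Joint pmf of Example 5.1, keyed by (auto deductible x, homeowner deductible y).
joint_pmf = {
    (100, 500): 0.30, (100, 1000): 0.05, (100, 5000): 0.00,
    (500, 500): 0.15, (500, 1000): 0.20, (500, 5000): 0.05,
    (1000, 500): 0.10, (1000, 1000): 0.10, (1000, 5000): 0.05,
}


def prob(event, pmf=joint_pmf):
    """Sum the joint pmf over all (x, y) pairs for which event(x, y) is true."""
    return sum(p for (x, y), p in pmf.items() if event(x, y))


total = prob(lambda x, y: True)            # all nine cells; should be 1
p_equal = prob(lambda x, y: x == y)        # P(X = Y)
p_auto_500 = prob(lambda x, y: x >= 500)   # P(X >= 500)

print(round(total, 4), round(p_equal, 4), round(p_auto_500, 4))
```

Describing the set A by a predicate keeps the code close to the definition: any event involving the pair (X, Y) can be evaluated by passing a different function to `prob`.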


Once the joint pmf of the two variables X and Y is available, it is in principle
straightforward to obtain the distribution of just one of these variables. As an example,
let X and Y be the number of statistics and mathematics courses, respectively, currently
being taken by a randomly selected statistics major. Suppose that we wish the distribution
of X, and that when X = 2, the only possible values of Y are 0, 1, and 2. Then

\[
p_X(2) = P(X = 2) = P[(X, Y) = (2, 0) \text{ or } (2, 1) \text{ or } (2, 2)] = p(2, 0) + p(2, 1) + p(2, 2)
\]

That is, the joint pmf is summed over all pairs of the form (2, y). More generally,
for any possible value x of X, the probability $p_X(x)$ results from holding x fixed and
summing the joint pmf p(x, y) over all y for which the pair (x, y) has positive probability
mass. The same strategy applies to obtaining the distribution of Y by itself.


DEFINITION The marginal probability mass function of X, denoted by $p_X(x)$, is given by

\[
p_X(x) = \sum_{y:\, p(x, y) > 0} p(x, y) \quad \text{for each possible value } x
\]

Similarly, the marginal probability mass function of Y is

\[
p_Y(y) = \sum_{x:\, p(x, y) > 0} p(x, y) \quad \text{for each possible value } y.
\]


The use of the word marginal here is a consequence of the fact that if the joint pmf is 
displayed in a rectangular table as in Example 5.1, then the row totals give the marginal 
pmf of X and the column totals give the marginal pmf of Y. Once these marginal pmf’s 
are available, the probability of any event involving only X or only Y can be calculated. 


EXAMPLE 5.2 (Example 5.1 continued) Possible X values are x = 100, 500, and 1000. Computing
row totals from the joint probability table yields

\[
p_X(100) = p(100, 500) + p(100, 1000) + p(100, 5000) = .30 + .05 + 0 = .35
\]
\[
p_X(500) = .15 + .20 + .05 = .40, \qquad p_X(1000) = 1 - (.35 + .40) = .25
\]

The marginal pmf of X is then

\[
p_X(x) = \begin{cases} .35 & x = 100 \\ .40 & x = 500 \\ .25 & x = 1000 \\ 0 & \text{otherwise} \end{cases}
\]

From this pmf, P(X ≥ 500) = .40 + .25 = .65, which we already calculated in
Example 5.1. Similarly, the marginal pmf of Y is obtained from the column totals as

\[
p_Y(y) = \begin{cases} .55 & y = 500 \\ .35 & y = 1000 \\ .10 & y = 5000 \\ 0 & \text{otherwise} \end{cases}
\]
■
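
The row-total and column-total calculation of Example 5.2 can be sketched in a few lines of Python. This is only an illustration, assuming the joint probability table of Example 5.1 is held in a dictionary; `joint_pmf` and `marginals` are names introduced here, not from the text.

```python
from collections import defaultdict

# Joint pmf of Example 5.1, keyed by (x, y).
joint_pmf = {
    (100, 500): 0.30, (100, 1000): 0.05, (100, 5000): 0.00,
    (500, 500): 0.15, (500, 1000): 0.20, (500, 5000): 0.05,
    (1000, 500): 0.10, (1000, 1000): 0.10, (1000, 5000): 0.05,
}


def marginals(pmf):
    """Return (p_X, p_Y): row totals and column totals of a joint pmf table."""
    p_x, p_y = defaultdict(float), defaultdict(float)
    for (x, y), p in pmf.items():
        p_x[x] += p   # hold x fixed, accumulate over y
        p_y[y] += p   # hold y fixed, accumulate over x
    return dict(p_x), dict(p_y)


p_x, p_y = marginals(joint_pmf)
print({x: round(p, 2) for x, p in p_x.items()})
print({y: round(p, 2) for y, p in p_y.items()})

# Any event involving X alone can now be computed from p_X.
p_x_at_least_500 = sum(p for x, p in p_x.items() if x >= 500)
```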


Two Continuous Random Variables 


The probability that the observed value of a continuous rv X lies in a one-dimen- 
sional set A (such as an interval) is obtained by integrating the pdf f(x) over the 
set A. Similarly, the probability that the pair (X, Y) of continuous rv’s falls in a 
two-dimensional set A (such as a rectangle) is obtained by integrating a function called 
the joint density function. 


DEFINITION Let X and Y be continuous rv’s. A joint probability density function
f(x, y) for these two variables is a function satisfying $f(x, y) \ge 0$ and
$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$. Then for any two-dimensional set A

\[
P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy
\]

In particular, if A is the two-dimensional rectangle {(x, y): a ≤ x ≤ b, c ≤ y ≤ d},
then

\[
P[(X, Y) \in A] = P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx
\]


We can visualize f(x, y) as specifying a surface at height f(x, y) above the point 
(x, y) in a three-dimensional coordinate system. Then P[(X, Y) € A] is the volume 
underneath this surface and above the region A, analogous to the area under a curve 
in the case of a single rv. This is illustrated in Figure 5.1. 


Figure 5.1 P[(X, Y) ∈ A] = volume under density surface above A


EXAMPLE 5.3 A bank operates both a drive-up facility and a walk-up window. On a randomly
selected day, let X = the proportion of time that the drive-up facility is in use (at least
one customer is being served or waiting to be served) and Y = the proportion of time
that the walk-up window is in use. Then the set of possible values for (X, Y) is the rectangle
D = {(x, y): 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}. Suppose the joint pdf of (X, Y) is given by

\[
f(x, y) = \begin{cases} \frac{6}{5}(x + y^2) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}
\]

To verify that this is a legitimate pdf, note that $f(x, y) \ge 0$ and

\[
\begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy
&= \int_0^1 \int_0^1 \frac{6}{5}(x + y^2)\,dx\,dy \\
&= \int_0^1 \int_0^1 \frac{6}{5}x\,dx\,dy + \int_0^1 \int_0^1 \frac{6}{5}y^2\,dx\,dy \\
&= \int_0^1 \frac{6}{5}x\,dx + \int_0^1 \frac{6}{5}y^2\,dy = \frac{6}{10} + \frac{6}{15} = 1
\end{aligned}
\]

The probability that neither facility is busy more than one-quarter of the time is

\[
\begin{aligned}
P\left(0 \le X \le \tfrac{1}{4},\ 0 \le Y \le \tfrac{1}{4}\right)
&= \int_0^{1/4} \int_0^{1/4} \frac{6}{5}(x + y^2)\,dx\,dy \\
&= \frac{6}{5} \int_0^{1/4} \int_0^{1/4} x\,dx\,dy + \frac{6}{5} \int_0^{1/4} \int_0^{1/4} y^2\,dx\,dy \\
&= \frac{6}{20} \cdot \frac{x^2}{2} \Big|_{x=0}^{x=1/4} + \frac{6}{20} \cdot \frac{y^3}{3} \Big|_{y=0}^{y=1/4} = \frac{7}{640} = .0109
\end{aligned}
\]
■
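
The two double integrals in Example 5.3 can be checked numerically. The sketch below uses a simple midpoint rule on a grid, which is one of several possible approximation schemes and is chosen here only because it needs no external library; the function names and the grid size n = 400 are assumptions made for illustration.

```python
def f(x, y):
    """Joint pdf of Example 5.3: (6/5)(x + y^2) on the unit square, 0 elsewhere."""
    if 0 <= x <= 1 and 0 <= y <= 1:
        return 6.0 / 5.0 * (x + y * y)
    return 0.0


def double_midpoint(g, a, b, c, d, n=400):
    """Approximate the integral of g over [a, b] x [c, d] with an n-by-n midpoint rule."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += g(x, y)
    return total * hx * hy


total_mass = double_midpoint(f, 0, 1, 0, 1)        # should be close to 1
p_quarter = double_midpoint(f, 0, 0.25, 0, 0.25)   # should be close to 7/640

print(round(total_mass, 4), round(p_quarter, 4))
```

Because the integrand is a low-degree polynomial, even a modest grid reproduces the exact values 1 and 7/640 to several decimal places.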


The marginal pdf of each variable can be obtained in a manner analogous to what we 
did in the case of two discrete variables. The marginal pdf of X at the value x results 
from holding x fixed in the pair (x, y) and integrating the joint pdf over y. Integrating 
the joint pdf with respect to x gives the marginal pdf of Y. 


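DEFINITION The marginal probability density functions of X and Y, denoted by $f_X(x)$ and
$f_Y(y)$, respectively, are given by

\[
f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{for } -\infty < x < \infty
\]

\[
f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \quad \text{for } -\infty < y < \infty
\]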


EXAMPLE 5.4 (Example 5.3 continued) The marginal pdf of X, which gives the probability
distribution of busy time for the drive-up facility without reference to the walk-up window, is

\[
f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^1 \frac{6}{5}(x + y^2)\,dy = \frac{6}{5}x + \frac{2}{5}
\]

for 0 ≤ x ≤ 1 and 0 otherwise. The marginal pdf of Y is

\[
f_Y(y) = \begin{cases} \frac{6}{5}y^2 + \frac{3}{5} & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}
\]

Then

\[
P(.25 \le Y \le .75) = \int_{.25}^{.75} f_Y(y)\,dy = \frac{37}{80} = .4625
\]
■
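
As a numerical cross-check of Example 5.4, the sketch below integrates the joint pdf over x to recover the marginal pdf of Y at a point and then integrates the closed-form marginal over [.25, .75]. The one-dimensional midpoint rule and all function names are illustrative assumptions rather than anything prescribed by the text.

```python
def f(x, y):
    """Joint pdf of Example 5.3 on the unit square."""
    return 6.0 / 5.0 * (x + y * y) if 0 <= x <= 1 and 0 <= y <= 1 else 0.0


def f_Y(y):
    """Closed-form marginal pdf of Y: (6/5)y^2 + 3/5 on [0, 1]."""
    return 6.0 / 5.0 * y * y + 3.0 / 5.0 if 0 <= y <= 1 else 0.0


def midpoint(g, a, b, n=10000):
    """One-dimensional midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


# Marginal at y = 0.5 obtained by integrating the joint pdf over x.
f_Y_numeric = midpoint(lambda x: f(x, 0.5), 0.0, 1.0)

# P(.25 <= Y <= .75) from the marginal pdf; the exact value is 37/80.
p_middle = midpoint(f_Y, 0.25, 0.75)

print(round(f_Y_numeric, 4), round(f_Y(0.5), 4), round(p_middle, 4))
```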


In Example 5.3, the region of positive joint density was a rectangle, which made 
computation of the marginal pdf’s relatively easy. Consider now an example in 
which the region of positive density is more complicated. 


EXAMPLE 5.5 A nut company markets cans of deluxe mixed nuts containing almonds, cashews, and
peanuts. Suppose the net weight of each can is exactly 1 lb, but the weight contribution
of each type of nut is random. Because the three weights sum to 1, a joint probability
model for any two gives all necessary information about the weight of the third
type. Let X = the weight of almonds in a selected can and Y = the weight of cashews.
Then the region of positive density is D = {(x, y): 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, x + y ≤ 1},
the shaded region pictured in Figure 5.2.


Figure 5.2 Region of positive density for Example 5.5 [the triangle with vertices (0, 0), (1, 0), and (0, 1); the vertical line through x meets the upper boundary at (x, 1 - x)]


Now let the joint pdf for (X, Y) be

\[
f(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases}
\]
For any fixed x, f(x, y) increases with y; for fixed y, f(x, y) increases with x. 
This is appropriate because the word deluxe implies that most of the can should 
consist of almonds and cashews rather than peanuts, so that the density function 
should be large near the upper boundary and small near the origin. The surface 
determined by f(x, y) slopes upward from zero as (x, y) moves away from either 
axis. 

Clearly, f(x, y) ≥ 0. To verify the second condition on a joint pdf, recall that
a double integral is computed as an iterated integral by holding one variable fixed
(such as x as in Figure 5.2), integrating over values of the other variable lying along
the straight line passing through the value of the fixed variable, and finally integrating
over all possible values of the fixed variable. Thus

\[
\begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dy\,dx
&= \iint_D f(x, y)\,dy\,dx = \int_0^1 \left\{ \int_0^{1-x} 24xy\,dy \right\} dx \\
&= \int_0^1 24x \left\{ \frac{y^2}{2} \Big|_{y=0}^{y=1-x} \right\} dx = \int_0^1 12x(1 - x)^2\,dx = 1
\end{aligned}
\]

To compute the probability that the two types of nuts together make up at most
50% of the can, let A = {(x, y): 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and x + y ≤ .5}, as shown
in Figure 5.3. Then

\[
P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy = \int_0^{.5} \int_0^{.5 - x} 24xy\,dy\,dx = .0625
\]


Figure 5.3 Computing P[(X, Y) ∈ A] for Example 5.5 [A = shaded region below the line x + y = .5]


The marginal pdf for almonds is obtained by holding X fixed at x and integrating the
joint pdf f(x, y) along the vertical line through x:

\[
f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \begin{cases} \int_0^{1-x} 24xy\,dy = 12x(1 - x)^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}
\]

By symmetry of f(x, y) and the region D, the marginal pdf of Y is obtained by replacing
x and X in $f_X(x)$ by y and Y, respectively. ■
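
When the region of positive density is not a rectangle, the limits of the inner integral depend on the fixed variable. The sketch below mirrors the iterated integrals of Example 5.5 with nested midpoint rules; the helper name `integrate_below_line` and the grid size are assumptions made for illustration, and the method is a plain numerical approximation rather than the analytic calculation used in the example.

```python
def f(x, y):
    """Joint pdf of Example 5.5: 24xy on the triangle x >= 0, y >= 0, x + y <= 1."""
    return 24.0 * x * y if x >= 0 and y >= 0 and x + y <= 1 else 0.0


def integrate_below_line(g, s, n=400):
    """Integrate g over the triangle x >= 0, y >= 0, x + y <= s by nested midpoint rules."""
    hx = s / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (s - x) / n   # for fixed x, y ranges over [0, s - x]
        inner = sum(g(x, (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total


total_mass = integrate_below_line(f, 1.0)   # whole region D; should be close to 1
p_half = integrate_below_line(f, 0.5)       # P(X + Y <= .5); should be close to .0625


def f_X(x, n=400):
    """Marginal pdf of X at x, integrating f(x, y) along the vertical line through x."""
    hy = (1.0 - x) / n
    return sum(f(x, (j + 0.5) * hy) for j in range(n)) * hy


print(round(total_mass, 4), round(p_half, 4), round(f_X(0.75), 4))
```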


Independent Random Variables 


In many situations, information about the observed value of one of the two variables 
X and Y gives information about the value of the other variable. In Example 5.1, the 
marginal probability of X at x = 100 is .35 and at x = 1000 is .25. However, if we 
learn that Y = 5000, the last column of the joint probability table tells us that X can’t 
possibly be 100 and the other two possibilities, 500 and 1000, are now equally likely. 
Thus knowing the value of Y changes the distribution of X; in such situations it is 
natural to say that there is a dependence between the two variables. 

In Chapter 2, we pointed out that one way of defining independence of two
events is via the condition P(A ∩ B) = P(A) · P(B). Here is an analogous definition
for the independence of two rv’s.


DEFINITION Two random variables X and Y are said to be independent if for every pair of
x and y values

\[
\begin{aligned}
p(x, y) &= p_X(x) \cdot p_Y(y) && \text{when } X \text{ and } Y \text{ are discrete} \\
&\text{or} && \qquad (5.1) \\
f(x, y) &= f_X(x) \cdot f_Y(y) && \text{when } X \text{ and } Y \text{ are continuous}
\end{aligned}
\]

If (5.1) is not satisfied for all (x, y), then X and Y are said to be dependent.
The definition says that two variables are independent if their joint pmf or pdf is
the product of the two marginal pmf’s or pdf’s. Intuitively, independence says that 
knowing the value of one of the variables does not provide additional information 
about what the value of the other variable might be. That is, the distribution of one 
variable does not depend on the value of the other variable. 


EXAMPLE 5.6 In the insurance situation of Examples 5.1 and 5.2,

\[
p(1000, 5000) = .05 \ne (.25)(.10) = p_X(1000) \cdot p_Y(5000)
\]

so X and Y are not independent. In fact, the joint probability table has an entry which
is 0, yet the corresponding row and column totals are both positive. Independence of
X and Y requires that every entry in the joint probability table be the product of the
corresponding row and column marginal probabilities. ■
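
The requirement that every table entry equal the product of its row and column totals translates directly into a cell-by-cell check. The following Python sketch assumes a dictionary representation of the joint pmf; the function name `is_independent`, the tolerance, and the second table `product_pmf` are hypothetical and serve only to illustrate the definition.

```python
def is_independent(pmf, tol=1e-9):
    """Check p(x, y) == p_X(x) * p_Y(y) for every (x, y) cell, up to a tolerance."""
    p_x, p_y = {}, {}
    for (x, y), p in pmf.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return all(
        abs(pmf.get((x, y), 0.0) - p_x[x] * p_y[y]) <= tol
        for x in p_x
        for y in p_y
    )


# Joint pmf of Example 5.1: dependent, as shown in Example 5.6.
joint_pmf = {
    (100, 500): 0.30, (100, 1000): 0.05, (100, 5000): 0.00,
    (500, 500): 0.15, (500, 1000): 0.20, (500, 5000): 0.05,
    (1000, 500): 0.10, (1000, 1000): 0.10, (1000, 5000): 0.05,
}

# A hypothetical table built as a product of the same marginals: independent by construction.
p_x = {100: 0.35, 500: 0.40, 1000: 0.25}
p_y = {500: 0.55, 1000: 0.35, 5000: 0.10}
product_pmf = {(x, y): px * py for x, px in p_x.items() for y, py in p_y.items()}

print(is_independent(joint_pmf), is_independent(product_pmf))
```

A single failing cell, such as the 0 entry at (100, 5000), is enough to establish dependence.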


EXAMPLE 5.7 (Example 5.5 continued) Because f(x, y) has the form of a product, X and Y would
appear to be independent. However, although $f_X(3/4) = f_Y(3/4) = 9/16$,
$f(3/4, 3/4) = 0 \ne 9/16 \cdot 9/16$, so the variables are not in fact independent.
To be independent, f(x, y) must have the
form g(x) · h(y) and the region of positive density must be a rectangle whose sides
are parallel to the coordinate axes. ■


Independence of two random variables is most useful when the description 
of the experiment under study suggests that X and Y have no effect on one another. 
Then once the marginal pmf’s or pdf’s have been specified, the joint pmf or pdf is 
simply the product of the two marginal functions. It follows that 


\[
P(a \le X \le b,\ c \le Y \le d) = P(a \le X \le b) \cdot P(c \le Y \le d)
\]


EXAMPLE 5.8 Suppose that the lifetimes of two components are independent of one another
and that the first lifetime, $X_1$, has an exponential distribution with parameter $\lambda_1$,
whereas the second, $X_2$, has an exponential distribution with parameter $\lambda_2$. Then
the joint pdf is

\[
f(x_1, x_2) = f_{X_1}(x_1) \cdot f_{X_2}(x_2)
= \begin{cases} \lambda_1 e^{-\lambda_1 x_1} \cdot \lambda_2 e^{-\lambda_2 x_2} = \lambda_1 \lambda_2 e^{-\lambda_1 x_1 - \lambda_2 x_2} & x_1 > 0,\ x_2 > 0 \\ 0 & \text{otherwise} \end{cases}
\]

Let $\lambda_1 = 1/1000$ and $\lambda_2 = 1/1200$, so that the expected lifetimes are 1000 hours and
1200 hours, respectively. The probability that both component lifetimes are at least
1500 hours is

\[
\begin{aligned}
P(1500 \le X_1,\ 1500 \le X_2) &= P(1500 \le X_1) \cdot P(1500 \le X_2) \\
&= e^{-\lambda_1(1500)} \cdot e^{-\lambda_2(1500)} \\
&= (.2231)(.2865) = .0639
\end{aligned}
\]
■
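
The arithmetic in Example 5.8 is easy to reproduce. The sketch below relies on the fact that an exponential rv with parameter λ satisfies P(X ≥ t) = e^{-λt}, and then multiplies the two survival probabilities as independence permits; the helper name `exp_survival` is an illustrative choice.

```python
import math


def exp_survival(lam, t):
    """P(X >= t) for an exponential rv with parameter lam (valid for t >= 0)."""
    return math.exp(-lam * t)


lam1, lam2 = 1 / 1000, 1 / 1200   # expected lifetimes of 1000 and 1200 hours
p1 = exp_survival(lam1, 1500)     # P(X1 >= 1500)
p2 = exp_survival(lam2, 1500)     # P(X2 >= 1500)
p_both = p1 * p2                  # product rule, justified by independence

print(round(p1, 4), round(p2, 4), round(p_both, 4))
```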


More Than Two Random Variables 


To model the joint behavior of more than two random variables, we extend the con- 
cept of a joint distribution of two variables. 
DEFINITION If $X_1, X_2, \ldots, X_n$ are all discrete random variables, the joint pmf of the variables
is the function

\[
p(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)
\]

If the variables are continuous, the joint pdf of $X_1, \ldots, X_n$ is the function
$f(x_1, x_2, \ldots, x_n)$ such that for any n intervals $[a_1, b_1], \ldots, [a_n, b_n]$,

\[
P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\,dx_n \cdots dx_1
\]


EXAMPLE 5.9 A binomial experiment consists of n dichotomous (success-failure), homogeneous
(constant success probability) independent trials. Now consider a trinomial experiment
in which each of the n trials can result in one of three possible outcomes. For
example, each successive customer at a store might pay with cash, a credit card, or
a debit card. The trials are assumed independent. Let \(p_1\) = P(trial results in a type
1 outcome) and define \(p_2\) and \(p_3\) analogously for type 2 and type 3 outcomes. The
random variables of interest here are \(X_i\) = the number of trials that result in a type
i outcome for i = 1, 2, 3.

In n = 10 trials, the probability that the first five are type 1 outcomes, the next
three are type 2, and the last two are type 3 (that is, the probability of the experimental
outcome 1111122233) is \(p_1^5 \cdot p_2^3 \cdot p_3^2\). This is also the probability of the
outcome 1122311123, and in fact the probability of any outcome that has exactly
five 1's, three 2's, and two 3's. Now to determine the probability \(P(X_1 = 5, X_2 = 3, X_3 = 2)\),
we have to count the number of outcomes that have exactly five 1's,
three 2's, and two 3's. First, there are \(\binom{10}{5}\) ways to choose five of the trials to be the
type 1 outcomes. Now from the remaining five trials, we choose three to be the type
2 outcomes, which can be done in \(\binom{5}{3}\) ways. This determines the remaining two
trials, which consist of type 3 outcomes. So the total number of ways of choosing
five 1's, three 2's, and two 3's is

\[ \binom{10}{5} \cdot \binom{5}{3} = \frac{10!}{5!\,5!} \cdot \frac{5!}{3!\,2!} = \frac{10!}{5!\,3!\,2!} = 2520 \]

Thus we see that \(P(X_1 = 5, X_2 = 3, X_3 = 2) = 2520\, p_1^5 \cdot p_2^3 \cdot p_3^2\). Generalizing this
to n trials gives

\[ p(x_1, x_2, x_3) = P(X_1 = x_1, X_2 = x_2, X_3 = x_3) = \frac{n!}{x_1!\,x_2!\,x_3!}\, p_1^{x_1} p_2^{x_2} p_3^{x_3} \]

for \(x_1 = 0, 1, 2, \ldots\); \(x_2 = 0, 1, 2, \ldots\); \(x_3 = 0, 1, 2, \ldots\) such that \(x_1 + x_2 + x_3 = n\).
Notice that whereas there are three random variables here, the third variable \(X_3\) is
actually redundant. For example, in the case n = 10, having \(X_1 = 5\) and \(X_2 = 3\)
implies that \(X_3 = 2\) (just as in a binomial experiment there are actually two rv's, the
number of successes and the number of failures, but the latter is redundant).

As a specific example, the genetic allele of a pea section can be either AA, Aa,
or aa. A simple genetic model specifies P(AA) = .25, P(Aa) = .50, and P(aa) = .25.
If the alleles of 10 independently obtained sections are determined, the probability
that exactly five of these are Aa and two are AA is

\[ p(2, 5, 3) = \frac{10!}{2!\,5!\,3!}\,(.25)^2 (.50)^5 (.25)^3 = .0769 \]

A natural extension of the trinomial scenario is an experiment consisting of n
independent and identical trials, in which each trial can result in any one of r possible
outcomes. Let \(p_i\) = P(outcome i on any particular trial), and define random
variables by \(X_i\) = the number of trials resulting in outcome i (i = 1, ..., r). This
is called a multinomial experiment, and the joint pmf of \(X_1, \ldots, X_r\) is called the
multinomial distribution. An argument analogous to the one used to derive the
trinomial pmf gives the multinomial pmf as

\[
p(x_1, \ldots, x_r) =
\begin{cases}
\dfrac{n!}{(x_1!)(x_2!) \cdots (x_r!)}\, p_1^{x_1} \cdots p_r^{x_r} & x_i = 0, 1, 2, \ldots, \text{ with } x_1 + \cdots + x_r = n \\
0 & \text{otherwise}
\end{cases}
\]
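
The multinomial pmf is straightforward to evaluate directly from its formula. The following Python sketch does so with exact integer arithmetic for the coefficient; the helper names are illustrative, and the number of trials n is taken to be the sum of the supplied counts.

```python
from math import factorial, prod

def multinomial_coefficient(counts):
    """n! / (x_1! x_2! ... x_r!) with n = sum(counts), computed exactly."""
    coeff = factorial(sum(counts))
    for x in counts:
        coeff //= factorial(x)   # each partial quotient is still an integer
    return coeff

def multinomial_pmf(counts, probs):
    """Multinomial pmf at the given counts; probs are the per-trial probabilities."""
    if len(counts) != len(probs) or any(x < 0 for x in counts):
        return 0.0
    return multinomial_coefficient(counts) * prod(p ** x for p, x in zip(probs, counts))

ways = multinomial_coefficient([5, 3, 2])             # arrangements of five 1's, three 2's, two 3's
pea = multinomial_pmf([2, 5, 3], [0.25, 0.50, 0.25])  # two AA, five Aa, three aa
print(ways, round(pea, 4))
```

With r = 2 the same function reduces to the binomial pmf, which gives a convenient sanity check.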


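EXAMPLE 5.10 When a certain method is used to collect a fixed volume of rock samples in a region,
there are four resulting rock types. Let \(X_1\), \(X_2\), and \(X_3\) denote the proportion by volume
of rock types 1, 2, and 3 in a randomly selected sample (the proportion of rock
type 4 is \(1 - X_1 - X_2 - X_3\), so a variable \(X_4\) would be redundant). If the joint pdf
of \(X_1, X_2, X_3\) is

\[
f(x_1, x_2, x_3) =
\begin{cases}
k x_1 x_2 (1 - x_3) & 0 \le x_1 \le 1,\ 0 \le x_2 \le 1,\ 0 \le x_3 \le 1,\ x_1 + x_2 + x_3 \le 1 \\
0 & \text{otherwise}
\end{cases}
\]

then k is determined by

\[
\begin{aligned}
1 &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x_1, x_2, x_3)\, dx_3\, dx_2\, dx_1 \\
&= \int_0^1 \left\{ \int_0^{1 - x_1} \left[ \int_0^{1 - x_1 - x_2} k x_1 x_2 (1 - x_3)\, dx_3 \right] dx_2 \right\} dx_1
\end{aligned}
\]

This iterated integral has value k/144, so k = 144. The probability that rocks of types
1 and 2 together account for at most 50% of the sample is

\[
\begin{aligned}
P(X_1 + X_2 \le .5) &= \iiint_{\{0 \le x_i \le 1 \text{ for } i = 1, 2, 3;\ x_1 + x_2 + x_3 \le 1;\ x_1 + x_2 \le .5\}} f(x_1, x_2, x_3)\, dx_3\, dx_2\, dx_1 \\
&= \int_0^{.5} \left\{ \int_0^{.5 - x_1} \left[ \int_0^{1 - x_1 - x_2} 144 x_1 x_2 (1 - x_3)\, dx_3 \right] dx_2 \right\} dx_1
\end{aligned}
\]

Carrying out the \(x_3\) integration and then writing \(u = x_1 + x_2\) reduces this to
\(12 \int_0^{.5} (u^3 - u^5)\, du = 5/32\), so the probability is .15625. ■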
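
Iterated integrals of this kind can be cross-checked numerically. The sketch below nests a simple midpoint rule three deep, using the same limits of integration as the example; the function names and the number of subintervals are arbitrary choices made for illustration, and a library integrator would serve equally well.

```python
def midpoint(f, lo, hi, n=40):
    """Composite midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def rock_integral(k, cap):
    """Integrate k*x1*x2*(1 - x3) over x1, x2, x3 >= 0 with
    x1 + x2 <= cap and x1 + x2 + x3 <= 1 (cap must not exceed 1)."""
    def over_x3(x1, x2):
        return midpoint(lambda x3: k * x1 * x2 * (1 - x3), 0.0, 1 - x1 - x2)

    def over_x2(x1):
        return midpoint(lambda x2: over_x3(x1, x2), 0.0, cap - x1)

    return midpoint(over_x2, 0.0, cap)

total_for_k1 = rock_integral(1.0, 1.0)   # total mass when k = 1
k = 1.0 / total_for_k1                   # normalizing constant
p_half = rock_integral(k, 0.5)           # P(X1 + X2 <= .5)
print(round(k, 1), round(p_half, 4))
```

Setting cap = 1 recovers the normalization condition, so the same routine yields both k and the requested probability.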


The notion of independence of more than two random variables is similar to the 
notion of independence of more than two events. 


DEFINITION The random variables \(X_1, X_2, \ldots, X_n\) are said to be independent if for every
subset \(X_{i_1}, X_{i_2}, \ldots, X_{i_k}\) of the variables (each pair, each triple, and so on), the joint
pmf or pdf of the subset is equal to the product of the marginal pmf's or pdf's.


Thus if the variables are independent with n = 4, then the joint pmf or pdf of any two
variables is the product of the two marginals, and similarly for any three variables and
all four variables together. Intuitively, independence means that learning the values
of some variables doesn't change the distribution of the remaining variables. Most
importantly, once we are told that n variables are independent, then the joint pmf or
pdf is the product of the n marginals.


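EXAMPLE 5.11 If \(X_1, \ldots, X_n\) represent the lifetimes of n components, the components operate
independently of one another, and each lifetime is exponentially distributed with
parameter \(\lambda\), then for \(x_1 \ge 0, x_2 \ge 0, \ldots, x_n \ge 0\),

\[ f(x_1, x_2, \ldots, x_n) = (\lambda e^{-\lambda x_1}) \cdot (\lambda e^{-\lambda x_2}) \cdots (\lambda e^{-\lambda x_n}) = \lambda^n e^{-\lambda \sum x_i} \]

Suppose a system consisting of these components will fail as soon as a single component
fails. Let T represent system lifetime. Then the probability that the system
lasts past time t is

\[
\begin{aligned}
P(T > t) &= P(X_1 > t, \ldots, X_n > t) = \int_t^{\infty} \cdots \int_t^{\infty} f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n \\
&= \left( \int_t^{\infty} \lambda e^{-\lambda x_1}\, dx_1 \right) \cdots \left( \int_t^{\infty} \lambda e^{-\lambda x_n}\, dx_n \right) \\
&= (e^{-\lambda t})^n = e^{-n\lambda t}
\end{aligned}
\]

Therefore,

\[ P(\text{system lifetime} \le t) = 1 - e^{-n\lambda t} \quad \text{for } t \ge 0 \]

which shows that system lifetime has an exponential distribution with parameter \(n\lambda\);
the expected value of system lifetime is \(1/(n\lambda)\).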
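
The conclusion that the minimum of n independent exponential lifetimes is again exponential can be checked by simulation. The Python sketch below compares the closed-form survival probability with an empirical proportion; the values of the rate, n, and t are illustrative choices, not taken from the text, and the seed is fixed only to make the run repeatable.

```python
import math
import random

def series_survival(lam, n, t):
    """Closed form P(T > t) = exp(-n*lam*t) for T = min of n iid exponentials."""
    return math.exp(-n * lam * t)

def simulate_series_lifetimes(lam, n, reps, seed=0):
    """Draw `reps` system lifetimes, each the minimum of n exponential lifetimes."""
    rng = random.Random(seed)
    return [min(rng.expovariate(lam) for _ in range(n)) for _ in range(reps)]

lam, n, t = 0.01, 5, 20.0          # illustrative values, not from the text
lifetimes = simulate_series_lifetimes(lam, n, reps=20000)
empirical = sum(x > t for x in lifetimes) / len(lifetimes)
mean_life = sum(lifetimes) / len(lifetimes)
print(empirical, series_survival(lam, n, t))   # simulated vs. exact P(T > t)
print(mean_life, 1 / (n * lam))                # simulated vs. exact E(T)
```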

A variation on the foregoing scenario appeared in the article "A Method for
Correlating Field Life Degradation with Reliability Prediction for Electronic
Modules" (Quality and Reliability Engr. Intl., 2005: 715-726). The investigators
considered a circuit card with n soldered chip resistors. The failure time of a card is
the minimum of the individual solder connection failure times (mileages here). It was
assumed that the solder connection failure mileages were independent, that failure
mileage would exceed t if and only if the shear strength of a connection exceeded a
threshold d, and that each shear strength was normally distributed with a mean value
and standard deviation that depended on the value of mileage t: \(\mu(t) = a_1 - a_2 t\) and
\(\sigma(t) = a_3 + a_4 t\) (a weld's shear strength typically deteriorates and becomes more
variable as mileage increases). Then the probability that the failure mileage of a card
exceeds t is

\[ P(T > t) = \left( 1 - \Phi\!\left( \frac{d - (a_1 - a_2 t)}{a_3 + a_4 t} \right) \right)^{n} \]

The cited article suggested values for d and the \(a_i\)'s based on data. In contrast to the
exponential scenario, normality of individual lifetimes does not imply normality of
system lifetime. ■
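
To see how this survival probability behaves as mileage grows, the formula can be coded directly. In the sketch below the standard normal cdf is written in terms of the error function; every parameter value is hypothetical and chosen only to exercise the formula, since the article's fitted values are not reproduced here.

```python
import math

def std_normal_cdf(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def card_survival(t, n, d, a1, a2, a3, a4):
    """P(T > t) = (1 - Phi((d - (a1 - a2*t)) / (a3 + a4*t)))**n."""
    mean = a1 - a2 * t      # mean shear strength at mileage t
    sd = a3 + a4 * t        # standard deviation of shear strength at mileage t
    return (1.0 - std_normal_cdf((d - mean) / sd)) ** n

# Hypothetical parameter values, not those suggested in the cited article.
params = dict(n=20, d=50.0, a1=100.0, a2=0.5, a3=5.0, a4=0.1)
for t in (0.0, 25.0, 50.0, 75.0):
    print(t, card_survival(t, **params))
```

Because the n connections are independent, the card-level probability is just the single-connection probability raised to the nth power, which is why even a modest per-connection failure chance erodes card reliability quickly.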


In many experimental situations to be considered in this book, independence is
a reasonable assumption, so that specifying the joint distribution reduces to deciding
on appropriate marginal distributions.


Conditional Distributions


Suppose X = the number of major defects in a randomly selected new automobile 
and Y = the number of minor defects in that same auto. If we learn that the selected 
car has one major defect, what now is the probability that the car has at most three 
minor defects, that is, what is \(P(Y \le 3 \mid X = 1)\)? Similarly, if X and Y denote the
lifetimes of the front and rear tires on a motorcycle, and it happens that X = 10,000 
miles, what now is the probability that Y is at most 15,000 miles, and what is the 
expected lifetime of the rear tire “conditional on” this value of X? Questions of this 
sort can be answered by studying conditional probability distributions. 


DEFINITION Let X and Y be two continuous rv's with joint pdf f(x, y) and marginal X pdf
\(f_X(x)\). Then for any X value x for which \(f_X(x) > 0\), the conditional probability
density function of Y given that X = x is

\[ f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)} \qquad -\infty < y < \infty \]

If X and Y are discrete, replacing pdf's by pmf's in this definition gives the
conditional probability mass function of Y when X = x.


Notice that the definition of \(f_{Y|X}(y \mid x)\) parallels that of P(B|A), the conditional
probability that B will occur, given that A has occurred. Once the conditional 
pdf or pmf has been determined, questions of the type posed at the outset of this 
subsection can be answered by integrating or summing over an appropriate set 
of Y values. 


EXAMPLE 5.12 Reconsider the situation of Examples 5.3 and 5.4 involving X = the proportion of
time that a bank's drive-up facility is busy and Y = the analogous proportion for the
walk-up window. The conditional pdf of Y given that X = .8 is

\[ f_{Y|X}(y \mid .8) = \frac{f(.8, y)}{f_X(.8)} = \frac{1.2(.8 + y^2)}{1.2(.8) + .4} = \frac{1}{34}(24 + 30y^2) \qquad 0 < y < 1 \]

The probability that the walk-up facility is busy at most half the time given that
X = .8 is then

\[ P(Y \le .5 \mid X = .8) = \int_{-\infty}^{.5} f_{Y|X}(y \mid .8)\, dy = \int_0^{.5} \frac{1}{34}(24 + 30y^2)\, dy = .390 \]

Using the marginal pdf of Y gives \(P(Y \le .5) = .350\). Also E(Y) = .6, whereas the
expected proportion of time that the walk-up facility is busy given that X = .8 (a
conditional expectation) is

\[ E(Y \mid X = .8) = \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y \mid .8)\, dy = \frac{1}{34} \int_0^1 y(24 + 30y^2)\, dy = .574 \] ■

If the two variables are independent, the marginal pmf or pdf in the denominator will
cancel the corresponding factor in the numerator. The conditional distribution is then
identical to the corresponding marginal distribution.


EXERCISES Section 5.1 (1-21)


1. A service station has both self-service and full-service
islands. On each island, there is a single regular unleaded
pump with two hoses. Let X denote the number of hoses
being used on the self-service island at a particular time,
and let Y denote the number of hoses on the full-service
island in use at that time. The joint pmf of X and Y appears
in the accompanying tabulation.

                      y
   p(x, y)      0      1      2
        0      .10    .04    .02
   x    1      .08    .20    .06
        2      .06    .14    .30

a. What is P(X = 1 and Y = 1)?

b. Compute \(P(X \le 1 \text{ and } Y \le 1)\).

c. Give a word description of the event \(\{X \ne 0 \text{ and } Y \ne 0\}\),
and compute the probability of this event.

d. Compute the marginal pmf of X and of Y. Using
\(p_X(x)\), what is \(P(X \le 1)\)?

e. Are X and Y independent rv's? Explain.


2. A large but sparsely populated county has two small
hospitals, one at the south end of the county and the
other at the north end. The south hospital's emergency
room has four beds, whereas the north hospital's emergency
room has only three beds. Let X denote the number
of south beds occupied at a particular time on a given
day, and let Y denote the number of north beds occupied
at the same time on the same day. Suppose that these two
rv's are independent; that the pmf of X puts probability
masses .1, .2, .3, .2, and .2 on the x values 0, 1, 2, 3,
and 4, respectively; and that the pmf of Y distributes
probabilities .1, .3, .4, and .2 on the y values 0, 1, 2, and
3, respectively.

a. Display the joint pmf of X and Y in a joint probability
table.

b. Compute \(P(X \le 1 \text{ and } Y \le 1)\) by adding probabilities
from the joint pmf, and verify that this equals the
product of \(P(X \le 1)\) and \(P(Y \le 1)\).

c. Express the event that the total number of beds occupied
at the two hospitals combined is at most 1 in
terms of X and Y, and then calculate this probability.

d. What is the probability that at least one of the two
hospitals has no beds occupied?


3. A certain market has both an express checkout line and
a superexpress checkout line. Let \(X_1\) denote the number
of customers in line at the express checkout at a particular
time of day, and let \(X_2\) denote the number of
customers in line at the superexpress checkout at the
same time. Suppose the joint pmf of \(X_1\) and \(X_2\) is as
given in the accompanying table.

                        x2
               0      1      2      3
        0     .08    .07    .04    .00
        1     .06    .15    .05    .04
   x1   2     .05    .04    .10    .06
        3     .00    .03    .04    .07
        4     .00    .01    .05    .06

a. What is \(P(X_1 = 1, X_2 = 1)\), that is, the probability
that there is exactly one customer in each line?

b. What is \(P(X_1 = X_2)\), that is, the probability that the
numbers of customers in the two lines are identical?

c. Let A denote the event that there are at least two
more customers in one line than in the other line.
Express A in terms of \(X_1\) and \(X_2\), and calculate the
probability of this event.

d. What is the probability that the total number of customers
in the two lines is exactly four? At least four?


4. Return to the situation described in Exercise 3.

a. Determine the marginal pmf of \(X_1\), and then calculate
the expected number of customers in line at the
express checkout.

b. Determine the marginal pmf of \(X_2\).

c. By inspection of the probabilities \(P(X_1 = 4)\),
\(P(X_2 = 0)\), and \(P(X_1 = 4, X_2 = 0)\), are \(X_1\) and \(X_2\)
independent random variables? Explain.


5. The number of customers waiting for gift-wrap service at
a department store is an rv X with possible values 0, 1, 2,
3, 4 and corresponding probabilities .1, .2, .3, .25, .15. A
randomly selected customer will have 1, 2, or 3 packages
for wrapping with probabilities .6, .3, and .1, respectively.
Let Y = the total number of packages to be wrapped for
the customers waiting in line (assume that the number of
packages submitted by one customer is independent of
the number submitted by any other customer).

a. Determine P(X = 3, Y = 3), i.e., p(3, 3).

b. Determine p(4, 11).

6. Let X denote the number of Canon SLR cameras sold
during a particular week by a certain store. The pmf of X
is

   x          0      1      2      3      4
   p_X(x)    .1     .2     .3     .25    .15

Sixty percent of all customers who purchase these
cameras also buy an extended warranty. Let Y denote
the number of purchasers during this week who buy an
extended warranty.

a. What is P(X = 4, Y = 2)? [Hint: This probability
equals \(P(Y = 2 \mid X = 4) \cdot P(X = 4)\); now think of
the four purchases as four trials of a binomial
experiment, with success on a trial corresponding to
buying an extended warranty.]

b. Calculate P(X = Y).

c. Determine the joint pmf of X and Y and then the
marginal pmf of Y.


7. The joint probability distribution of the number X of
cars and the number Y of buses per signal cycle at a
proposed left-turn lane is displayed in the accompanying
joint probability table.

                       y
   p(x, y)      0       1       2
        0      .025    .015    .010
        1      .050    .030    .020
   x    2      .125    .075    .050
        3      .150    .090    .060
        4      .100    .060    .040
        5      .050    .030    .020

a. What is the probability that there is exactly one car
and exactly one bus during a cycle?

b. What is the probability that there is at most one car
and at most one bus during a cycle?

c. What is the probability that there is exactly one car
during a cycle? Exactly one bus?

d. Suppose the left-turn lane is to have a capacity of five
cars, and that one bus is equivalent to three cars. What
is the probability of an overflow during a cycle?

e. Are X and Y independent rv's? Explain.


8. A stockroom currently has 30 components of a certain
type, of which 8 were provided by supplier 1, 10 by supplier
2, and 12 by supplier 3. Six of these are to be randomly
selected for a particular assembly. Let X = the
number of supplier 1's components selected, Y = the
number of supplier 2's components selected, and p(x, y)
denote the joint pmf of X and Y.

a. What is p(3, 2)? [Hint: Each sample of size 6 is equally
likely to be selected. Therefore, p(3, 2) = (number of
outcomes with X = 3 and Y = 2)/(total number of
outcomes). Now use the product rule for counting to
obtain the numerator and denominator.]

b. Using the logic of part (a), obtain p(x, y). (This can
be thought of as a multivariate hypergeometric
distribution: sampling without replacement from a
finite population consisting of more than two
categories.)


9. Each front tire on a particular type of vehicle is supposed
to be filled to a pressure of 26 psi. Suppose the actual air
pressure in each tire is a random variable, X for the right
tire and Y for the left tire, with joint pdf

\[
f(x, y) =
\begin{cases}
K(x^2 + y^2) & 20 \le x \le 30,\ 20 \le y \le 30 \\
0 & \text{otherwise}
\end{cases}
\]

a. What is the value of K?

b. What is the probability that both tires are underfilled?

c. What is the probability that the difference in air pressure
between the two tires is at most 2 psi?

d. Determine the (marginal) distribution of air pressure
in the right tire alone.

e. Are X and Y independent rv's?


10. Annie and Alvie have agreed to meet between 5:00 p.m.
and 6:00 p.m. for dinner at a local health-food restaurant.
Let X = Annie's arrival time and Y = Alvie's arrival
time. Suppose X and Y are independent with each uniformly
distributed on the interval [5, 6].

a. What is the joint pdf of X and Y?

b. What is the probability that they both arrive between
5:15 and 5:45?

c. If the first one to arrive will wait only 10 min before
leaving to eat elsewhere, what is the probability that
they have dinner at the health-food restaurant? [Hint:
The event of interest is \(A = \{(x, y): |x - y| \le 1/6\}\).]


11. Two different professors have just submitted final exams for
duplication. Let X denote the number of typographical
errors on the first professor's exam and Y denote the number
of such errors on the second exam. Suppose X has a Poisson
distribution with parameter \(\mu_1\), Y has a Poisson distribution
with parameter \(\mu_2\), and X and Y are independent.

a. What is the joint pmf of X and Y?

b. What is the probability that at most one error is made
on both exams combined?

c. Obtain a general expression for the probability that
the total number of errors in the two exams is m
(where m is a nonnegative integer). [Hint:
\(A = \{(x, y): x + y = m\} = \{(m, 0), (m - 1, 1), \ldots, (1, m - 1), (0, m)\}\).
Now sum the joint pmf over \((x, y) \in A\) and use the binomial theorem, which
says that

\[ \sum_{k=0}^{m} \binom{m}{k} a^k b^{m-k} = (a + b)^m \]

for any a, b.]


12. Two components of a minicomputer have the following
joint pdf for their useful lifetimes X and Y:

\[
f(x, y) =
\begin{cases}
x e^{-x(1 + y)} & x \ge 0 \text{ and } y \ge 0 \\
0 & \text{otherwise}
\end{cases}
\]

a. What is the probability that the lifetime X of the first
component exceeds 3?

b. What are the marginal pdf's of X and Y? Are the two
lifetimes independent? Explain.

c. What is the probability that the lifetime of at least
one component exceeds 3?


13. You have two lightbulbs for a particular lamp. Let
X = the lifetime of the first bulb and Y = the lifetime of
the second bulb (both in 1000s of hours). Suppose that X
and Y are independent and that each has an exponential
distribution with parameter \(\lambda = 1\).

a. What is the joint pdf of X and Y?

b. What is the probability that each bulb lasts at most
1000 hours (i.e., \(X \le 1\) and \(Y \le 1\))?

c. What is the probability that the total lifetime of the
two bulbs is at most 2? [Hint: Draw a picture of the
region \(A = \{(x, y): x \ge 0,\ y \ge 0,\ x + y \le 2\}\) before
integrating.]

d. What is the probability that the total lifetime is
between 1 and 2?


14. Suppose that you have ten lightbulbs, that the lifetime of
each is independent of all the other lifetimes, and that each
lifetime has an exponential distribution with parameter \(\lambda\).

a. What is the probability that all ten bulbs fail before
time t?

b. What is the probability that exactly k of the ten bulbs
fail before time t?

c. Suppose that nine of the bulbs have lifetimes that are
exponentially distributed with parameter \(\lambda\) and that
the remaining bulb has a lifetime that is exponentially
distributed with parameter \(\theta\) (it is made by
another manufacturer). What is the probability that
exactly five of the ten bulbs fail before time t?


15. Consider a system consisting of three components as
pictured. The system will continue to function as long as
the first component functions and either component 2 or
component 3 functions. Let \(X_1\), \(X_2\), and \(X_3\) denote the
lifetimes of components 1, 2, and 3, respectively.
Suppose the \(X_i\)'s are independent of one another and each
\(X_i\) has an exponential distribution with parameter \(\lambda\).

[Figure: diagram of the three-component system]

a. Let Y denote the system lifetime. Obtain the cumulative
distribution function of Y and differentiate to
obtain the pdf. [Hint: \(F(y) = P(Y \le y)\); express the
event \(\{Y \le y\}\) in terms of unions and/or intersections
of the three events \(\{X_1 \le y\}\), \(\{X_2 \le y\}\), and
\(\{X_3 \le y\}\).]

b. Compute the expected system lifetime.


16. a. For \(f(x_1, x_2, x_3)\) as given in Example 5.10, compute
the joint marginal density function of \(X_1\) and \(X_3\)
alone (by integrating over \(x_2\)).

b. What is the probability that rocks of types 1 and 3
together make up at most 50% of the sample? [Hint:
Use the result of part (a).]

c. Compute the marginal pdf of \(X_1\) alone. [Hint: Use the
result of part (a).]


17. An ecologist wishes to select a point inside a circular
sampling region according to a uniform distribution (in
practice this could be done by first selecting a direction
and then a distance from the center in that direction).
Let X = the x coordinate of the point selected and
Y = the y coordinate of the point selected. If the circle
is centered at (0, 0) and has radius R, then the joint pdf
of X and Y is

\[
f(x, y) =
\begin{cases}
\dfrac{1}{\pi R^2} & x^2 + y^2 \le R^2 \\
0 & \text{otherwise}
\end{cases}
\]

a. What is the probability that the selected point is
within R/2 of the center of the circular region?
[Hint: Draw a picture of the region of positive
density D. Because f(x, y) is constant on D, computing
a probability reduces to computing an area.]

b. What is the probability that both X and Y differ from
0 by at most R/2?

c. Answer part (b) for \(R/\sqrt{2}\) replacing R/2.

d. What is the marginal pdf of X? Of Y? Are X and Y
independent?


18. Refer to Exercise 1 and answer the following questions:

a. Given that X = 1, determine the conditional pmf of
Y, i.e., \(p_{Y|X}(0 \mid 1)\), \(p_{Y|X}(1 \mid 1)\), and \(p_{Y|X}(2 \mid 1)\).

b. Given that two hoses are in use at the self-service
island, what is the conditional pmf of the number of
hoses in use on the full-service island?

c. Use the result of part (b) to calculate the conditional
probability \(P(Y \le 1 \mid X = 2)\).

d. Given that two hoses are in use at the full-service
island, what is the conditional pmf of the number in
use at the self-service island?


19. The joint pdf of pressures for right and left front tires is
given in Exercise 9.

a. Determine the conditional pdf of Y given that X = x
and the conditional pdf of X given that Y = y.

b. If the pressure in the right tire is found to be 22 psi,
what is the probability that the left tire has a pressure
of at least 25 psi? Compare this to \(P(Y \ge 25)\).

c. If the pressure in the right tire is found to be 22 psi,
what is the expected pressure in the left tire, and
what is the standard deviation of pressure in this tire?


20. Let \(X_1, X_2, X_3, X_4, X_5\), and \(X_6\) denote the numbers of blue,
brown, green, orange, red, and yellow M&M candies,
respectively, in a sample of size n. Then these \(X_i\)'s have
a multinomial distribution. According to the M&M Web
site, the color proportions are \(p_1 = .24\), \(p_2 = .13\),
\(p_3 = .16\), \(p_4 = .20\), \(p_5 = .13\), and \(p_6 = .14\).

a. If n = 12, what is the probability that there are
exactly two M&Ms of each color?

b. For n = 20, what is the probability that there are at
most five orange candies? [Hint: Think of an orange
candy as a success and any other color as a failure.]

c. In a sample of 20 M&Ms, what is the probability that
the number of candies that are blue, green, or orange
is at least 10?


21. Let \(X_1\), \(X_2\), and \(X_3\) be the lifetimes of components 1, 2,
and 3 in a three-component system.

a. How would you define the conditional pdf of \(X_3\)
given that \(X_1 = x_1\) and \(X_2 = x_2\)?

b. How would you define the conditional joint pdf of \(X_2\)
and \(X_3\) given that \(X_1 = x_1\)?


5.2 Expected Values, Covariance, and Correlation


Any function h(X) of a single rv X is itself a random variable. However, we saw that 
to compute E[h(X)], it is not necessary to obtain the probability distribution of h(X). 
Instead, E[h(X)] is computed as a weighted average of h(x) values, where the weight 
function is the pmf p(x) or pdf f(x) of X. A similar result holds for a function h(X, Y) 
of two jointly distributed random variables. 


PROPOSITION Let X and Y be jointly distributed rv's with pmf p(x, y) or pdf f(x, y) according
to whether the variables are discrete or continuous. Then the expected value of
a function h(X, Y), denoted by E[h(X, Y)] or \(\mu_{h(X, Y)}\), is given by

\[
E[h(X, Y)] =
\begin{cases}
\displaystyle \sum_x \sum_y h(x, y) \cdot p(x, y) & \text{if X and Y are discrete} \\
\displaystyle \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\, dx\, dy & \text{if X and Y are continuous}
\end{cases}
\]


EXAMPLE 5.13 Five friends have purchased tickets to a certain concert. If the tickets are for seats
1-5 in a particular row and the tickets are randomly distributed among the five, what
is the expected number of seats separating any particular two of the five? Let X and
Y denote the seat numbers of the first and second individuals, respectively. Possible
(X, Y) pairs are {(1, 2), (1, 3), ..., (5, 4)}, and the joint pmf of (X, Y) is

\[
p(x, y) =
\begin{cases}
\dfrac{1}{20} & x = 1, \ldots, 5;\ y = 1, \ldots, 5;\ x \ne y \\
0 & \text{otherwise}
\end{cases}
\]

The number of seats separating the two individuals is h(X, Y) = |X - Y| - 1. The
accompanying table gives h(x, y) for each possible (x, y) pair.

                        x
   h(x, y)     1     2     3     4     5
        1      -     0     1     2     3
        2      0     -     0     1     2
   y    3      1     0     -     0     1
        4      2     1     0     -     0
        5      3     2     1     0     -

Thus

\[
E[h(X, Y)] = \sum_{(x, y)} h(x, y) \cdot p(x, y) = \sum_{x=1}^{5} \sum_{\substack{y=1 \\ y \ne x}}^{5} (|x - y| - 1) \cdot \frac{1}{20} = 1
\] ■
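
The weighted-sum definition translates almost word for word into code. The sketch below builds the joint pmf of the two seat numbers as a dictionary and sums h(x, y) p(x, y) over its support, using exact fractions; the names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def expected_value(h, pmf):
    """E[h(X, Y)] as a weighted sum of h(x, y) over the support of the joint pmf."""
    return sum(h(x, y) * p for (x, y), p in pmf.items())

# Joint pmf: the 20 ordered pairs of distinct seats, each with probability 1/20.
seat_pmf = {pair: Fraction(1, 20) for pair in permutations(range(1, 6), 2)}

seats_between = expected_value(lambda x, y: abs(x - y) - 1, seat_pmf)
print(seats_between)
```

Note that the distribution of h(X, Y) itself is never constructed; only the joint pmf of (X, Y) is needed.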


EXAMPLE 5.14 In Example 5.5, the joint pdf of the amount X of almonds and amount Y of cashews
in a 1-lb can of nuts was

\[
f(x, y) =
\begin{cases}
24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\
0 & \text{otherwise}
\end{cases}
\]

If 1 lb of almonds costs the company $1.50, 1 lb of cashews costs $2.25, and 1 lb of
peanuts costs $.75, then the total cost of the contents of a can is

\[ h(X, Y) = (1.5)X + (2.25)Y + (.75)(1 - X - Y) = .75 + .75X + 1.5Y \]

(since 1 - X - Y of the weight consists of peanuts). The expected total cost is

\[
\begin{aligned}
E[h(X, Y)] &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\, dx\, dy \\
&= \int_0^1 \int_0^{1 - x} (.75 + .75x + 1.5y) \cdot 24xy\, dy\, dx = \$1.65
\end{aligned}
\] ■
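
The double integral in this example can be approximated with a nested midpoint rule over the triangular region of positive density. The following sketch is one way to do it; the helper names and the grid size are illustrative choices.

```python
def midpoint(f, lo, hi, n=200):
    """Composite midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def expected_over_triangle(g, n=200):
    """E[g(X, Y)] for the joint pdf 24xy on the triangle x, y >= 0, x + y <= 1."""
    def inner(x):
        return midpoint(lambda y: g(x, y) * 24 * x * y, 0.0, 1.0 - x, n)
    return midpoint(inner, 0.0, 1.0, n)

def can_cost(x, y):
    """Cost of a can holding x lb almonds, y lb cashews, and 1 - x - y lb peanuts."""
    return 0.75 + 0.75 * x + 1.5 * y

print(round(expected_over_triangle(can_cost), 2))
```

Passing the constant function 1 in place of can_cost should return a value very close to 1, which confirms that 24xy integrates to 1 over the triangle.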


The method of computing the expected value of a function \(h(X_1, \ldots, X_n)\) of n
random variables is similar to that for two random variables. If the \(X_i\)'s are discrete,
\(E[h(X_1, \ldots, X_n)]\) is an n-dimensional sum; if the \(X_i\)'s are continuous, it is an
n-dimensional integral.


Covariance 


When two random variables X and Y are not independent, it is frequently of interest 
to assess how strongly they are related to one another. 


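DEFINITION The covariance between two rv's X and Y is

\[
\begin{aligned}
\text{Cov}(X, Y) &= E[(X - \mu_X)(Y - \mu_Y)] \\
&=
\begin{cases}
\displaystyle \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, p(x, y) & X, Y \text{ discrete} \\
\displaystyle \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dx\, dy & X, Y \text{ continuous}
\end{cases}
\end{aligned}
\]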


That is, since \(X - \mu_X\) and \(Y - \mu_Y\) are the deviations of the two variables from their
respective mean values, the covariance is the expected product of deviations. Note
that \(\text{Cov}(X, X) = E[(X - \mu_X)^2] = V(X)\).

The rationale for the definition is as follows. Suppose X and Y have a strong
positive relationship to one another, by which we mean that large values of X
tend to occur with large values of Y and small values of X with small values of
Y. Then most of the probability mass or density will be associated with \((x - \mu_X)\)
and \((y - \mu_Y)\), either both positive (both X and Y above their respective means) or
both negative, so the product \((x - \mu_X)(y - \mu_Y)\) will tend to be positive. Thus for
a strong positive relationship, Cov(X, Y) should be quite positive. For a strong
negative relationship, the signs of \((x - \mu_X)\) and \((y - \mu_Y)\) will tend to be opposite,
yielding a negative product. Thus for a strong negative relationship, Cov(X, Y)
should be quite negative. If X and Y are not strongly related, positive and negative
products will tend to cancel one another, yielding a covariance near 0. Figure
5.4 illustrates the different possibilities. The covariance depends on both the set
of possible pairs and the probabilities. In Figure 5.4, the probabilities could be
changed without altering the set of possible pairs, and this could drastically change
the value of Cov(X, Y).


Figure 5.4 p(x, y) = 1/10 for each of ten pairs corresponding to indicated points:
(a) positive covariance; (b) negative covariance; (c) covariance near zero


EXAMPLE 5.15 The joint and marginal pmf's for X = automobile policy deductible amount and
Y = homeowner policy deductible amount in Example 5.1 were

                        y
   p(x, y)      500    1000    5000
        100     .30    .05     0
   x    500     .15    .20     .05
       1000     .10    .10     .05

   x          100    500    1000          y          500    1000    5000
   p_X(x)     .35    .40    .25           p_Y(y)     .55    .35     .10

from which \(\mu_X = \sum x\, p_X(x) = 485\) and \(\mu_Y = 1125\). Therefore,

\[
\begin{aligned}
\text{Cov}(X, Y) &= \sum_{(x, y)} (x - 485)(y - 1125)\, p(x, y) \\
&= (100 - 485)(500 - 1125)(.30) + \cdots + (1000 - 485)(5000 - 1125)(.05) \\
&= 136{,}875
\end{aligned}
\] ■

The following shortcut formula for Cov(X, Y) simplifies the computations.


PROPOSITION

\[ \text{Cov}(X, Y) = E(XY) - \mu_X \cdot \mu_Y \]


According to this formula, no intermediate subtractions are necessary; only at
the end of the computation is \(\mu_X \cdot \mu_Y\) subtracted from E(XY). The proof involves
expanding \((X - \mu_X)(Y - \mu_Y)\) and then carrying the summation or integration
through to each individual term.
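
The expansion just described takes only a few lines when written out. The sketch below relies on the linearity of expectation, together with \(E(X) = \mu_X\) and \(E(Y) = \mu_Y\), and treats \(\mu_X\) and \(\mu_Y\) as constants.

```latex
\begin{aligned}
\mathrm{Cov}(X, Y) &= E[(X - \mu_X)(Y - \mu_Y)] \\
&= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\
&= E(XY) - \mu_X E(Y) - \mu_Y E(X) + \mu_X \mu_Y \\
&= E(XY) - \mu_X \mu_Y - \mu_X \mu_Y + \mu_X \mu_Y \\
&= E(XY) - \mu_X \mu_Y
\end{aligned}
```

Taking Y = X in the last line recovers the familiar shortcut \(V(X) = E(X^2) - \mu_X^2\).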


EXAMPLE 5.16 (Example 5.5 continued)

The joint and marginal pdf's of X = amount of almonds and Y = amount of cashews
were

\[
f(x, y) =
\begin{cases}
24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\
0 & \text{otherwise}
\end{cases}
\]

\[
f_X(x) =
\begin{cases}
12x(1 - x)^2 & 0 \le x \le 1 \\
0 & \text{otherwise}
\end{cases}
\]

with \(f_Y(y)\) obtained by replacing x by y in \(f_X(x)\). It is easily verified that \(\mu_X = \mu_Y = 2/5\),
and

\[
\begin{aligned}
E(XY) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f(x, y)\, dx\, dy = \int_0^1 \int_0^{1 - x} xy \cdot 24xy\, dy\, dx \\
&= 8 \int_0^1 x^2 (1 - x)^3\, dx = \frac{2}{15}
\end{aligned}
\]

Thus Cov(X, Y) = 2/15 - (2/5)(2/5) = 2/15 - 4/25 = -2/75. A negative covariance
is reasonable here because more almonds in the can implies fewer cashews. ■
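
The moments in this example can be obtained exactly rather than by hand integration. The sketch below uses the standard identity that the integral of \(x^a y^b\) over the triangle \(x \ge 0,\ y \ge 0,\ x + y \le 1\) equals \(a!\,b!/(a + b + 2)!\) for nonnegative integers a and b; this identity is brought in for illustration and is not derived in the text, and the function names are likewise illustrative.

```python
from fractions import Fraction
from math import factorial

def triangle_moment(a, b):
    """Exact integral of x**a * y**b over the triangle x, y >= 0, x + y <= 1.

    Uses the Dirichlet-integral identity a! b! / (a + b + 2)!.
    """
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 2))

def nut_expectation(a, b):
    """E(X**a * Y**b) under the joint pdf 24xy on the triangle."""
    return 24 * triangle_moment(a + 1, b + 1)

mu_x = nut_expectation(1, 0)
mu_y = nut_expectation(0, 1)
e_xy = nut_expectation(1, 1)
cov_xy = e_xy - mu_x * mu_y      # shortcut formula Cov = E(XY) - mu_X * mu_Y
print(mu_x, mu_y, e_xy, cov_xy)
```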


It might appear that the relationship in the insurance example is quite strong 
since Cov(X, Y) = 136,875, whereas Cov(X, Y) = —2/75 in the nut example would 
seem to imply quite a weak relationship. Unfortunately, the covariance has a seri- 
ous defect that makes it impossible to interpret a computed value. In the insurance 
example, suppose we had expressed the deductible amount in cents rather than in 
dollars. Then 100X would replace X, 100Y would replace Y, and the resulting covariance
would be Cov(100X, 100Y) = (100)(100)Cov(X, Y) = 1,368,750,000. If, on the
other hand, the deductible amount had been expressed in hundreds of dollars, the 
computed covariance would have been (.01)(.01)(136,875) = 13.6875. The defect 
of covariance is that its computed value depends critically on the units of measure- 
ment. Ideally, the choice of units should have no effect on a measure of strength of 
relationship. This is achieved by scaling the covariance. 


Correlation 


DEFINITION The correlation coefficient of X and Y, denoted by Corr(X, Y), \(\rho_{X,Y}\), or just \(\rho\),
is defined by

\[ \rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y} \]

EXAMPLE 5.17 It is easily verified that in the insurance scenario of Example 5.15, \(E(X^2) = 353{,}500\),
\(\sigma_X^2 = 353{,}500 - (485)^2 = 118{,}275\), \(\sigma_X = 343.911\), \(E(Y^2) = 2{,}987{,}500\),
\(\sigma_Y^2 = 1{,}721{,}875\), and \(\sigma_Y = 1312.202\). This gives

\[ \rho = \frac{136{,}875}{(343.911)(1312.202)} = .303 \] ■
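
The contrast between covariance and correlation under a change of units is easy to demonstrate. The sketch below computes both quantities for the insurance joint pmf (entries as read from the table in Example 5.15) with the deductibles in dollars and again in cents; the helper names are illustrative.

```python
import math

def moments(pmf):
    """Return (mu_X, mu_Y, sigma_X, sigma_Y, Cov) for a joint pmf {(x, y): p}."""
    mu_x = sum(x * p for (x, _), p in pmf.items())
    mu_y = sum(y * p for (_, y), p in pmf.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, _), p in pmf.items())
    var_y = sum((y - mu_y) ** 2 * p for (_, y), p in pmf.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in pmf.items())
    return mu_x, mu_y, math.sqrt(var_x), math.sqrt(var_y), cov

def correlation(pmf):
    """Corr(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y)."""
    _, _, sd_x, sd_y, cov = moments(pmf)
    return cov / (sd_x * sd_y)

def rescale(pmf, a, c):
    """Joint pmf of (aX, cY) for nonzero a and c; a = c = 100 converts dollars to cents."""
    return {(a * x, c * y): p for (x, y), p in pmf.items()}

deductibles = {
    (100, 500): .30, (100, 1000): .05, (100, 5000): .00,
    (500, 500): .15, (500, 1000): .20, (500, 5000): .05,
    (1000, 500): .10, (1000, 1000): .10, (1000, 5000): .05,
}
in_cents = rescale(deductibles, 100, 100)
print(moments(deductibles)[4], moments(in_cents)[4])        # covariance changes with units
print(correlation(deductibles), correlation(in_cents))      # correlation does not
```

The covariance is multiplied by 100 · 100, while the standard deviations in the denominator of the correlation absorb exactly the same factor.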


The following proposition shows that $\rho$ remedies the defect of Cov(X, Y) and also
suggests how to recognize the existence of a strong (linear) relationship.


PROPOSITION 1. If a and c are either both positive or both negative,
Corr(aX + b, cY + d) = Corr(X, Y)


2. For any two rv's X and Y, $-1 \le \rho \le 1$. The two variables are said to be
uncorrelated when $\rho = 0$.

Statement 1 says precisely that the correlation coefficient is not affected by a
linear change in the units of measurement (if, say, X = temperature in °C, then
9X/5 + 32 = temperature in °F). According to Statement 2, the strongest possible
positive relationship is evidenced by $\rho = +1$, the strongest possible negative relationship
corresponds to $\rho = -1$, and $\rho = 0$ indicates the absence of a linear relationship. The
proof of the first statement is sketched in Exercise 35, and that of the second appears
in Supplementary Exercise 87 at the end of the chapter. For descriptive purposes, the
relationship will be described as strong if $|\rho| \ge .8$, moderate if $.5 < |\rho| < .8$, and
weak if $|\rho| \le .5$.
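
The unit dependence of the covariance, and the way the correlation coefficient removes it, can be checked numerically. The sketch below is only an illustration: the joint pmf `PMF` and the helper names (`expect`, `cov`, `corr`, `rescale`) are made up for this purpose and do not come from the insurance or nut examples.

```python
import math

# Hypothetical joint pmf {(x, y): probability}; the values are illustrative only.
PMF = {(0, 0): 0.2, (0, 10): 0.1, (20, 0): 0.1, (20, 10): 0.3, (40, 10): 0.3}


def expect(pmf, h):
    """Return E[h(X, Y)] for a discrete joint pmf."""
    return sum(p * h(x, y) for (x, y), p in pmf.items())


def cov(pmf):
    """Cov(X, Y) = E(XY) - E(X)E(Y)."""
    mu_x = expect(pmf, lambda x, y: x)
    mu_y = expect(pmf, lambda x, y: y)
    return expect(pmf, lambda x, y: x * y) - mu_x * mu_y


def corr(pmf):
    """Corr(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y)."""
    mu_x = expect(pmf, lambda x, y: x)
    mu_y = expect(pmf, lambda x, y: y)
    var_x = expect(pmf, lambda x, y: (x - mu_x) ** 2)
    var_y = expect(pmf, lambda x, y: (y - mu_y) ** 2)
    return cov(pmf) / math.sqrt(var_x * var_y)


def rescale(pmf, a, b, c, d):
    """Joint pmf of (aX + b, cY + d), assuming a and c are nonzero."""
    return {(a * x + b, c * y + d): p for (x, y), p in pmf.items()}


in_cents = rescale(PMF, 100, 0, 100, 0)   # change of units: dollars to cents
shifted = rescale(PMF, 9 / 5, 32, 2, -7)  # a and c both positive
flipped = rescale(PMF, -1, 0, 1, 0)       # a and c of opposite sign

print("Cov:", cov(PMF), "->", cov(in_cents))
print("Corr:", corr(PMF), corr(in_cents), corr(shifted), corr(flipped))
```

Under these assumptions the covariance is multiplied by (100)(100) when both variables are expressed in cents, while the correlation is unchanged whenever a and c share a sign; with opposite signs only the sign of the correlation changes.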

If we think of p(x, y) or f(x, y) as prescribing a mathematical model for how
the two numerical variables X and Y are distributed in some population (height and
weight, verbal SAT score and quantitative SAT score, etc.), then $\rho$ is a population
characteristic or parameter that measures how strongly X and Y are related in the
population. In Chapter 12, we will consider taking a sample of pairs $(x_1, y_1), \ldots,
(x_n, y_n)$ from the population. The sample correlation coefficient r will then be defined
and used to make inferences about $\rho$.

The correlation coefficient $\rho$ is actually not a completely general measure of
the strength of a relationship.


PROPOSITION 1. If X and Y are independent, then $\rho = 0$, but $\rho = 0$ does not imply
independence.


2. $\rho = 1$ or $-1$ iff Y = aX + b for some numbers a and b with $a \ne 0$.


This proposition says that $\rho$ is a measure of the degree of linear relationship between
X and Y, and only when the two variables are perfectly related in a linear manner will
$\rho$ be as positive or negative as it can be. However, if $|\rho| \ll 1$, there may still be a
strong relationship between the two variables, just one that is not linear. And even
if $|\rho|$ is close to 1, it may be that the relationship is really nonlinear but can be well
approximated by a straight line.


EXAMPLE 5.18 Let X and Y be discrete rv's with joint pmf

$$
p(x, y) =
\begin{cases}
\tfrac{1}{4} & (x, y) = (-4, 1),\ (4, -1),\ (2, 2),\ (-2, -2) \\
0 & \text{otherwise}
\end{cases}
$$

The points that receive positive probability mass are identified on the (x, y) coordinate
system in Figure 5.5. It is evident from the figure that the value of X is completely
determined by the value of Y and vice versa, so the two variables are completely
dependent. However, by symmetry $\mu_X = \mu_Y = 0$ and
$E(XY) = (-4)(.25) + (-4)(.25) + (4)(.25) + (4)(.25) = 0$. The covariance is then
$\mathrm{Cov}(X, Y) = E(XY) - \mu_X \cdot \mu_Y = 0$ and thus $\rho_{X,Y} = 0$. Although there is
perfect dependence, there is also complete absence of any linear relationship!
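
The calculation in Example 5.18 is small enough to reproduce exactly. The following sketch uses exact fractions; the variable names are illustrative, and independence is tested through the product condition p(x, y) = p_X(x) · p_Y(y) on every pair of marginal values.

```python
from fractions import Fraction

QUARTER = Fraction(1, 4)
# Joint pmf of Example 5.18: four points, each with probability 1/4.
PMF = {(-4, 1): QUARTER, (4, -1): QUARTER, (2, 2): QUARTER, (-2, -2): QUARTER}

mu_x = sum(p * x for (x, y), p in PMF.items())
mu_y = sum(p * y for (x, y), p in PMF.items())
e_xy = sum(p * x * y for (x, y), p in PMF.items())
covariance = e_xy - mu_x * mu_y

# Marginal pmfs, used to test the independence condition p(x, y) = p_X(x) p_Y(y).
p_x, p_y = {}, {}
for (x, y), p in PMF.items():
    p_x[x] = p_x.get(x, 0) + p
    p_y[y] = p_y.get(y, 0) + p

independent = all(
    PMF.get((x, y), 0) == p_x[x] * p_y[y] for x in p_x for y in p_y
)

print("Cov(X, Y) =", covariance)
print("independent:", independent)
```

The covariance comes out as exactly zero even though the product condition fails, which is the point of the example: zero correlation rules out a linear relationship, not dependence.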


Figure 5.5 The population of pairs for Example 5.18 ■

A value of $\rho$ near 1 does not necessarily imply that increasing the value of X causes Y
to increase. It implies only that large X values are associated with large Y values. For
example, in the population of children, vocabulary size and number of cavities are
quite positively correlated, but it is certainly not true that cavities cause vocabulary
to grow. Instead, the values of both these variables tend to increase as the value of
age, a third variable, increases. For children of a fixed age, there is probably a low
correlation between number of cavities and vocabulary size. In summary, association
(a high correlation) is not the same as causation.


The Bivariate Normal Distribution 


Just as the most useful univariate distribution in statistical practice is the normal
distribution, the most useful joint distribution for two rv's X and Y is the bivariate
normal distribution. The pdf is somewhat complicated:

$$
f(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}
\exp\left\{ -\frac{1}{2(1 - \rho^2)}
\left[ \left(\frac{x - \mu_1}{\sigma_1}\right)^2
 - 2\rho\,\frac{(x - \mu_1)(y - \mu_2)}{\sigma_1\sigma_2}
 + \left(\frac{y - \mu_2}{\sigma_2}\right)^2 \right] \right\}
$$

A graph of this pdf, the density surface, appears in Figure 5.6. It follows (after
some tricky integration) that the marginal distribution of X is normal with mean
value $\mu_1$ and standard deviation $\sigma_1$, and similarly the marginal distribution of Y is
normal with mean $\mu_2$ and standard deviation $\sigma_2$. The fifth parameter of the distribution
is $\rho$, which can be shown to be the correlation coefficient between X and Y.

Figure 5.6 A graph of the bivariate normal pdf

It is not at all straightforward to integrate the bivariate normal pdf in order to calculate
probabilities. Instead, selected software packages employ numerical integration
techniques for this purpose.


EXAMPLE 5.19 Many students applying for college take the SAT, which for a few years consisted of
three components: Critical Reading, Mathematics, and Writing. While some colleges
used all three components to determine admission, many only looked at the first
two (reading and math). Let X and Y denote the Critical Reading and Mathematics
scores, respectively, for a randomly selected student. According to the College Board
website, the population of students taking the exam in Fall 2012 had the following
characteristics: $\mu_1 = 496$, $\sigma_1 = 114$, $\mu_2 = 514$, $\sigma_2 = 117$.

Suppose that X and Y have (approximately, since both variables are discrete) a
bivariate normal distribution with correlation coefficient $\rho = .25$. The Matlab software
package gives $P(X \le 650, Y \le 650)$ = P(both scores are at most 650) = .8097. ■

It can also be shown that the conditional distribution of Y given that X = x is
normal. This can be seen geometrically by slicing the density surface with a plane
perpendicular to the (x, y) plane and passing through the value x on the x axis; the result is
a normal curve sketched out on the slicing plane. The conditional mean value is
$\mu_{Y \cdot x} = (\mu_2 - \rho\mu_1\sigma_2/\sigma_1) + \rho\sigma_2 x/\sigma_1$, a linear function of x, and the conditional
variance is $\sigma_{Y \cdot x}^2 = (1 - \rho^2)\sigma_2^2$. The closer the correlation coefficient is to 1 or −1, the
less variability there is in the conditional distribution. Analogous results hold for the
conditional distribution of X given that Y = y.
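
When numerical integration software is not at hand, a bivariate normal probability can be approximated by simulation. The sketch below is a Monte Carlo stand-in, not the integration routine used in Example 5.19. It assumes the standard construction of a correlated pair from two independent standard normal draws; the function names, the seed, and the number of draws are arbitrary choices.

```python
import math
import random

MU1, SIGMA1 = 496.0, 114.0  # Critical Reading (Example 5.19)
MU2, SIGMA2 = 514.0, 117.0  # Mathematics (Example 5.19)
RHO = 0.25


def draw_pair(rng):
    """Draw (X, Y) from the bivariate normal via two independent N(0, 1) values."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = MU1 + SIGMA1 * z1
    y = MU2 + SIGMA2 * (RHO * z1 + math.sqrt(1 - RHO ** 2) * z2)
    return x, y


def estimate_joint_prob(limit=650.0, n_draws=200_000, seed=1):
    """Monte Carlo estimate of P(X <= limit, Y <= limit)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        x, y = draw_pair(rng)
        if x <= limit and y <= limit:
            hits += 1
    return hits / n_draws


def conditional_mean_sd(x):
    """Mean and standard deviation of Y given X = x."""
    mean = (MU2 - RHO * MU1 * SIGMA2 / SIGMA1) + RHO * SIGMA2 * x / SIGMA1
    sd = math.sqrt(1 - RHO ** 2) * SIGMA2
    return mean, sd


print("estimated P(X <= 650, Y <= 650):", estimate_joint_prob())
print("Y given X = 650 (mean, sd):", conditional_mean_sd(650.0))
```

The estimate should land near the .8097 reported in the example, up to simulation error of roughly .001 with this many draws. The second function simply evaluates the conditional mean and variance formulas stated above.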

The bivariate normal distribution can be generalized to the multivariate 
normal distribution. Its density function is quite complicated, and the only way 
to write it compactly is to employ matrix notation. If a collection of variables has 
this distribution, then the marginal distribution of any single variable is normal, the 
conditional distribution of any single variable given values of the other variables is 
normal, the joint marginal distribution of any pair of variables is bivariate normal, 
and the joint marginal distribution of any subset of three or more of the variables is 
again multivariate normal. 


EXERCISES Section 5.2 (22-36) 


22. An instructor has given a short quiz consisting of two parts. For a randomly selected
student, let X = the number of points earned on the first part and Y = the number of points
earned on the second part. Suppose that the joint pmf of X and Y is given in the
accompanying table.

                      y
   p(x, y)      0      5     10     15
        0     .02    .06    .02    .10
   x    5     .04    .15    .20    .10
       10     .01    .15    .14    .01

a. If the score recorded in the grade book is the total number of points earned on the two
parts, what is the expected recorded score E(X + Y)?
b. If the maximum of the two scores is recorded, what is the expected recorded score?

23. The difference between the number of customers in line at the express checkout and the
number in line at the super-express checkout in Exercise 3 is $X_1 - X_2$. Calculate the
expected difference.

24. Six individuals, including A and B, take seats around a circular table in a completely
random fashion. Suppose the seats are numbered 1, ..., 6. Let X = A's seat number and
Y = B's seat number. If A sends a written message around the table to B in the direction in
which they are closest, how many individuals (including A and B) would you expect to handle
the message?

25. A surveyor wishes to lay out a square region with each side having length L. However,
because of a measurement error, he instead lays out a rectangle in which the north-south sides
both have length X and the east-west sides both have length Y. Suppose that X and Y are
independent and that each is uniformly distributed on the interval [L − A, L + A] (where
0 < A < L). What is the expected area of the resulting rectangle?

26. Consider a small ferry that can accommodate cars and buses. The toll for cars is $3, and
the toll for buses is $10. Let X and Y denote the number of cars and buses, respectively,
carried on a single trip. Suppose the joint distribution of X and Y is as given in the table
of Exercise 7. Compute the expected revenue from a single trip.

27. Annie and Alvie have agreed to meet for lunch between noon (0:00 p.m.) and 1:00 p.m.
Denote Annie's arrival time by X, Alvie's by Y, and suppose X and Y are independent with
pdf's

$$
f_X(x) = \begin{cases} 3x^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}
\qquad
f_Y(y) = \begin{cases} 2y & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}
$$

What is the expected amount of time that the one who arrives first must wait for the other
person? [Hint: h(X, Y) = |X − Y|.]

28. Show that if X and Y are independent rv's, then E(XY) = E(X) · E(Y). Then apply this in
Exercise 25. [Hint: Consider the continuous case with $f(x, y) = f_X(x) \cdot f_Y(y)$.]

29. Compute the correlation coefficient $\rho$ for X and Y of Example 5.16 (the covariance has
already been computed).

30. a. Compute the covariance for X and Y in Exercise 22.
b. Compute $\rho$ for X and Y in the same exercise.

31. a. Compute the covariance between X and Y in Exercise 9.
b. Compute the correlation coefficient $\rho$ for this X and Y.

32. Reconsider the minicomputer component lifetimes X and Y as described in Exercise 12.
Determine E(XY). What can be said about Cov(X, Y) and $\rho$?

33. Use the result of Exercise 28 to show that when X and Y are independent,
Cov(X, Y) = Corr(X, Y) = 0.

34. a. Recalling the definition of $\sigma^2$ for a single rv X, write a formula that would be
appropriate for computing the variance of a function h(X, Y) of two random variables.
[Hint: Remember that variance is just a special expected value.]
b. Use this formula to compute the variance of the recorded score h(X, Y) [= max(X, Y)] in
part (b) of Exercise 22.

35. a. Use the rules of expected value to show that Cov(aX + b, cY + d) = ac Cov(X, Y).
b. Use part (a) along with the rules of variance and standard deviation to show that
Corr(aX + b, cY + d) = Corr(X, Y) when a and c have the same sign.
c. What happens if a and c have opposite signs?

36. Show that if Y = aX + b ($a \ne 0$), then Corr(X, Y) = +1 or −1. Under what conditions
will $\rho = +1$?


5.3 Statistics and Their Distributions


The observations in a single sample were denoted in Chapter 1 by $x_1, x_2, \ldots, x_n$.
Consider selecting two different samples of size n from the same population distribution.
The $x_i$'s in the second sample will virtually always differ at least a bit from
those in the first sample. For example, a first sample of n = 3 cars of a particular
type might result in fuel efficiencies $x_1 = 30.7$, $x_2 = 29.4$, $x_3 = 31.1$, whereas a
second sample may give $x_1 = 28.8$, $x_2 = 30.0$, and $x_3 = 32.5$. Before we obtain data,
there is uncertainty about the value of each $x_i$. Because of this uncertainty, before
the data becomes available we now regard each observation as a random variable
and denote the sample by $X_1, X_2, \ldots, X_n$ (uppercase letters for random variables).

This variation in observed values in turn implies that the value of any function
of the sample observations (such as the sample mean, sample standard deviation, or
sample fourth spread) also varies from sample to sample. That is, prior to obtaining
$x_1, \ldots, x_n$, there is uncertainty as to the value of $\bar{x}$, the value of s, and so on.


EXAMPLE 5.20 Suppose that material strength for a randomly selected specimen of a particular
type has a Weibull distribution with parameter values $\alpha = 2$ (shape) and $\beta = 5$
(scale). The corresponding density curve is shown in Figure 5.7. Formulas from
Section 4.5 give

$$
\mu = E(X) = 4.4311 \qquad \tilde{\mu} = 4.1628 \qquad \sigma^2 = V(X) = 5.365 \qquad \sigma = 2.316
$$

The mean exceeds the median because of the distribution's positive skew.

Figure 5.7 The Weibull density curve for Example 5.20

We used statistical software to generate six different samples, each with n = 10, from
this distribution (material strengths for six different groups of ten specimens each). The
results appear in Table 5.1, followed by the values of the sample mean, sample median,
and sample standard deviation for each sample. Notice first that the ten observations
in any particular sample are all different from those in any other sample. Second, the
six values of the sample mean are all different from one another, as are the six values
of the sample median and the six values of the sample standard deviation. The same is
true of the sample 10% trimmed means, sample fourth spreads, and so on.


Table 5.1 Samples from the Weibull Distribution of Example 5.20

Sample              1          2          3          4          5          6

 1             6.1171    5.07611    3.46710    1.55601    3.12372    8.93795
 2             4.1600    6.79279    2.71938    4.56941    6.09685    3.92487
 3             3.1950    4.43259    5.88129    4.79870    3.41181    8.76202
 4             0.6694    8.55752    5.14915    2.49759    1.65409    7.05569
 5             1.8552    6.82487    4.99635    2.33267    2.29512    2.30932
 6             5.2316    7.39958    5.86887    4.01295    2.12583    5.94195
 7             2.7609    2.14755    6.05918    9.08845    3.20938    6.74166
 8            10.2185    8.50628    1.80119    3.25728    3.23209    1.75468
 9             5.2438    5.49510    4.21994    3.70132    6.84426    4.91827
10             4.5590    4.04525    2.12934    5.50134    4.20694    7.26081

$\bar{x}$       4.401      5.928      4.229      4.132      3.620      5.761
$\tilde{x}$     4.360      6.144      4.608      3.857      3.221      6.342
$s$             2.642      2.062      1.611      2.124      1.678      2.496


Furthermore, the value of the sample mean from any particular sample can be
regarded as a point estimate ("point" because it is a single number, corresponding to
a single point on the number line) of the population mean $\mu$, whose value is known
to be 4.4311. None of the estimates from these six samples is identical to what is
being estimated. The estimates from the second and sixth samples are much too
large, whereas the fifth sample gives a substantial underestimate. Similarly, the sample
standard deviation gives a point estimate of the population standard deviation.
All six of the resulting estimates are in error by at least a small amount.

In summary, the values of the individual sample observations vary from sample
to sample, so will in general the value of any quantity computed from sample data, and
the value of a sample characteristic used as an estimate of the corresponding population
characteristic will virtually never coincide with what is being estimated. ■
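
The experiment of Example 5.20 is easy to repeat. The sketch below evaluates the standard Weibull mean, median, and variance formulas for shape 2 and scale 5, then draws six samples of size 10 with Python's `random` module. The seed and function name are arbitrary, and the generated values will not match Table 5.1, which came from different software.

```python
import math
import random
import statistics

SHAPE, SCALE = 2.0, 5.0  # alpha (shape) and beta (scale) from Example 5.20

# Population characteristics from the standard Weibull formulas.
mu = SCALE * math.gamma(1 + 1 / SHAPE)
median = SCALE * math.log(2) ** (1 / SHAPE)
var = SCALE ** 2 * (math.gamma(1 + 2 / SHAPE) - math.gamma(1 + 1 / SHAPE) ** 2)


def summarize_samples(num_samples=6, n=10, seed=2016):
    """Return (mean, median, standard deviation) for each simulated sample."""
    rng = random.Random(seed)
    rows = []
    for _ in range(num_samples):
        # Note the argument order: weibullvariate(scale, shape).
        sample = [rng.weibullvariate(SCALE, SHAPE) for _ in range(n)]
        rows.append(
            (statistics.mean(sample), statistics.median(sample), statistics.stdev(sample))
        )
    return rows


print(f"mu = {mu:.4f}, median = {median:.4f}, var = {var:.3f}, sd = {math.sqrt(var):.3f}")
for i, (xbar, xmed, s) in enumerate(summarize_samples(), start=1):
    print(f"sample {i}: mean = {xbar:.3f}, median = {xmed:.3f}, s = {s:.3f}")
```

Each run with a different seed gives six different sets of summary values, none of which coincides with the population values printed on the first line.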


DEFINITION A statistic is any quantity whose value can be calculated from sample data. 
Prior to obtaining data, there is uncertainty as to what value of any particular 
statistic will result. Therefore, a statistic is a random variable and will be 
denoted by an uppercase letter; a lowercase letter is used to represent the 
calculated or observed value of the statistic. 


Thus the sample mean, regarded as a statistic (before a sample has been selected
or an experiment carried out), is denoted by $\bar{X}$; the calculated value of this statistic
is $\bar{x}$. Similarly, S represents the sample standard deviation thought of as a statistic,
and its computed value is s. If samples of two different types of bricks are selected
and the individual compressive strengths are denoted by $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$,
respectively, then the statistic $\bar{X} - \bar{Y}$, the difference between the two sample mean
compressive strengths, is often of great interest.

Any statistic, being a random variable, has a probability distribution. In particular,
the sample mean $\bar{X}$ has a probability distribution. Suppose, for example, that
n = 2 components are randomly selected and the number of breakdowns while
under warranty is determined for each one. Possible values for the sample mean
number of breakdowns $\bar{X}$ are 0 (if $X_1 = X_2 = 0$), .5 (if either $X_1 = 0$ and $X_2 = 1$ or
$X_1 = 1$ and $X_2 = 0$), 1, 1.5, .... The probability distribution of $\bar{X}$ specifies $P(\bar{X} = 0)$,
$P(\bar{X} = .5)$, and so on, from which other probabilities such as $P(1 \le \bar{X} \le 3)$ and
$P(\bar{X} \ge 2.5)$ can be calculated. Similarly, if for a sample of size n = 2, the only possible
values of the sample variance are 0, 12.5, and 50 (which is the case if $X_1$ and
$X_2$ can each take on only the values 40, 45, or 50), then the probability distribution
of $S^2$ gives $P(S^2 = 0)$, $P(S^2 = 12.5)$, and $P(S^2 = 50)$. The probability distribution of
a statistic is sometimes referred to as its sampling distribution to emphasize that it
describes how the statistic varies in value across all samples that might be selected.


Random Samples 


The probability distribution of any particular statistic depends not only on the
population distribution (normal, uniform, etc.) and the sample size n but also on the
method of sampling. Consider selecting a sample of size n = 2 from a population
consisting of just the three values 1, 5, and 10, and suppose that the statistic of interest
is the sample variance. If sampling is done "with replacement," then $S^2 = 0$ will
result if $X_1 = X_2$. However, $S^2$ cannot equal 0 if sampling is "without replacement."
So $P(S^2 = 0) = 0$ for one sampling method, and this probability is positive for the
other method. Our next definition describes a sampling method often encountered
(at least approximately) in practice.
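
The effect of the sampling method on the distribution of the sample variance can be seen by listing every ordered sample of size 2 from the population {1, 5, 10}. The sketch below assumes that each of the three values is equally likely to be drawn, which the text does not state explicitly; the helper names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations, product

POPULATION = (1, 5, 10)


def sample_variance(sample):
    """Sample variance with divisor n - 1, computed exactly."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


def s2_distribution(ordered_samples):
    """pmf of S^2 when every ordered sample listed is equally likely (an assumption)."""
    samples = list(ordered_samples)
    dist = defaultdict(Fraction)
    for sample in samples:
        dist[sample_variance(sample)] += Fraction(1, len(samples))
    return dict(dist)


with_replacement = s2_distribution(product(POPULATION, repeat=2))
without_replacement = s2_distribution(permutations(POPULATION, 2))

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

With replacement, three of the nine ordered samples repeat a value, so the value 0 receives positive probability; without replacement no sample can repeat a value and 0 never occurs.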


DEFINITION The rv's $X_1, X_2, \ldots, X_n$ are said to form a (simple) random sample of size n if


1. The $X_i$'s are independent rv's.
2. Every $X_i$ has the same probability distribution.


Conditions 1 and 2 can be paraphrased by saying that the $X_i$'s are independent and
identically distributed (iid). If sampling is either with replacement or from an infinite
(conceptual) population, Conditions 1 and 2 are satisfied exactly. These conditions
will be approximately satisfied if sampling is without replacement, yet the sample
size n is much smaller than the population size N. In practice, if $n/N \le .05$ (at most
5% of the population is sampled), we can proceed as if the $X_i$'s form a random
sample. The virtue of such random sampling is that the probability distribution of
any statistic can be more easily obtained than for any other sampling procedure.

There are two general methods for obtaining information about a statistic’s 
sampling distribution. One method involves calculations based on probability rules, 
and the other involves carrying out a simulation experiment. 


Deriving a Sampling Distribution 


Probability rules can be used to obtain the distribution of a statistic provided that it
is a "fairly simple" function of the $X_i$'s and either there are relatively few different
X values in the population or else the population distribution has a "nice" form. Our
next two examples illustrate such situations.

EXAMPLE 5.21 A certain brand of MP3 player comes in three configurations: a model with 2 GB of
memory, costing $80, a 4 GB model priced at $100, and an 8 GB version with a price
tag of $120. If 20% of all purchasers choose the 2 GB model, 30% choose the 4 GB
model, and 50% choose the 8 GB model, then the probability distribution of the cost
X of a single randomly selected MP3 player purchase is given by

x      |   80    100    120
p(x)   |   .2     .3     .5        with $\mu = 106$, $\sigma^2 = 244$        (5.2)

Suppose on a particular day only two MP3 players are sold. Let $X_1$ = the revenue
from the first sale and $X_2$ = the revenue from the second. Suppose that $X_1$ and $X_2$
are independent, each with the probability distribution shown in (5.2) [so that $X_1$ and
$X_2$ constitute a random sample from the distribution (5.2)]. Table 5.2 lists possible
$(x_1, x_2)$ pairs, the probability of each [computed using (5.2) and also the assumption
of independence], and the resulting $\bar{x}$ and $s^2$ values. [Note that when n = 2,
$s^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2$.] Now to obtain the probability distribution of $\bar{X}$, the sample
average revenue per sale, we must consider each possible value $\bar{x}$ and compute its
probability. For example, $\bar{x} = 100$ occurs three times in the table with probabilities
.10, .09, and .10, so

$$
p_{\bar{X}}(100) = P(\bar{X} = 100) = .10 + .09 + .10 = .29
$$

Similarly,

$$
p_{S^2}(800) = P(S^2 = 800) = P(X_1 = 80, X_2 = 120 \text{ or } X_1 = 120, X_2 = 80) = .10 + .10 = .20
$$

Table 5.2 Outcomes, Probabilities, and Values of $\bar{x}$ and $s^2$ for Example 5.21

$x_1$    $x_2$    $p(x_1, x_2)$    $\bar{x}$    $s^2$
  80       80         .04            80          0
  80      100         .06            90        200
  80      120         .10           100        800
 100       80         .06            90        200
 100      100         .09           100          0
 100      120         .15           110        200
 120       80         .10           100        800
 120      100         .15           110        200
 120      120         .25           120          0


The complete sampling distributions of $\bar{X}$ and $S^2$ appear in (5.3) and (5.4).

$\bar{x}$               |   80     90    100    110    120
$p_{\bar{X}}(\bar{x})$  |  .04    .12    .29    .30    .25        (5.3)

$s^2$                   |    0    200    800
$p_{S^2}(s^2)$          |  .38    .42    .20                      (5.4)


Figure 5.8 pictures a probability histogram for both the original distribution (5.2)
and the $\bar{X}$ distribution (5.3). The figure suggests first that the mean (expected value)
of the $\bar{X}$ distribution is equal to the mean 106 of the original distribution, since both
histograms appear to be centered at the same place.

Figure 5.8 Probability histograms for the underlying distribution and $\bar{X}$ distribution in Example 5.21

From (5.3),

$$
\mu_{\bar{X}} = E(\bar{X}) = \sum \bar{x}\, p_{\bar{X}}(\bar{x}) = (80)(.04) + \cdots + (120)(.25) = 106 = \mu
$$

Second, it appears that the $\bar{X}$ distribution has smaller spread (variability) than the original
distribution, since probability mass has moved in toward the mean. Again from (5.3),

$$
\begin{aligned}
\sigma_{\bar{X}}^2 = V(\bar{X}) &= \sum \bar{x}^2 \cdot p_{\bar{X}}(\bar{x}) - \mu_{\bar{X}}^2 \\
&= (80^2)(.04) + \cdots + (120^2)(.25) - (106)^2 \\
&= 122 = 244/2 = \sigma^2/2
\end{aligned}
$$

The variance of $\bar{X}$ is precisely half that of the original variance (because n = 2).
Using (5.4), the mean value of $S^2$ is

$$
\mu_{S^2} = E(S^2) = \sum s^2 \cdot p_{S^2}(s^2) = (0)(.38) + (200)(.42) + (800)(.20) = 244 = \sigma^2
$$

That is, the $\bar{X}$ sampling distribution is centered at the population mean $\mu$, and the $S^2$
sampling distribution is centered at the population variance $\sigma^2$.

If there had been four purchases on the day of interest, the sample average revenue
$\bar{X}$ would be based on a random sample of four $X_i$'s, each having the distribution
(5.2). Mildly tedious calculations yield the pmf of $\bar{X}$ for n = 4 as

$\bar{x}$               |    80      85      90      95     100     105     110     115     120
$p_{\bar{X}}(\bar{x})$  | .0016   .0096   .0376   .0936   .1761   .2340   .2350   .1500   .0625

From this, $\mu_{\bar{X}} = 106 = \mu$ and $\sigma_{\bar{X}}^2 = 61 = \sigma^2/4$. Figure 5.9 is a probability histogram
of this pmf.

Figure 5.9 Probability histogram for $\bar{X}$ based on n = 4 in Example 5.21
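
The "mildly tedious calculations" of Example 5.21 amount to listing every possible ordered sample, multiplying the individual probabilities (independence), and accumulating the probability attached to each value of the statistic. A minimal sketch of that enumeration, using exact fractions and illustrative function names, is:

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Population distribution (5.2): price -> probability.
PMF = {80: Fraction(2, 10), 100: Fraction(3, 10), 120: Fraction(5, 10)}


def sample_mean(sample):
    return Fraction(sum(sample), len(sample))


def sample_variance(sample):
    xbar = sample_mean(sample)
    return sum((x - xbar) ** 2 for x in sample) / (len(sample) - 1)


def sampling_distribution(pmf, n, statistic):
    """Exact pmf of a statistic for a random sample of size n from pmf."""
    dist = defaultdict(Fraction)
    for sample in product(pmf, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= pmf[x]  # independence: multiply the individual probabilities
        dist[statistic(sample)] += prob
    return dict(dist)


def mean_of(dist):
    return sum(value * p for value, p in dist.items())


def var_of(dist):
    m = mean_of(dist)
    return sum((value - m) ** 2 * p for value, p in dist.items())


xbar_n2 = sampling_distribution(PMF, 2, sample_mean)
s2_n2 = sampling_distribution(PMF, 2, sample_variance)
xbar_n4 = sampling_distribution(PMF, 4, sample_mean)

print("n = 2:", {float(k): float(p) for k, p in sorted(xbar_n2.items())})
print("E and V of sample mean, n = 2:", mean_of(xbar_n2), var_of(xbar_n2))
print("E(S^2), n = 2:", mean_of(s2_n2))
print("E and V of sample mean, n = 4:", mean_of(xbar_n4), var_of(xbar_n4))
```

The number of ordered samples grows as 3^n here, so this brute-force approach is practical only for small n and few population values, which is exactly the limitation the example points out.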


Example 5.21 should suggest first of all that the computation of $p_{\bar{X}}(\bar{x})$ and
$p_{S^2}(s^2)$ can be tedious. If the original distribution (5.2) had allowed for more than
three possible values, then even for n = 2 the computations would have been more
involved. The example should also suggest, however, that there are some general
relationships between $E(\bar{X})$, $V(\bar{X})$, $E(S^2)$, and the mean $\mu$ and variance $\sigma^2$ of the
original distribution. These are stated in the next section. Now consider an example
in which the random sample is drawn from a continuous distribution.

EXAMPLE 5.22 Service time for a certain type of bank transaction is a random variable having an
exponential distribution with parameter $\lambda$. Suppose $X_1$ and $X_2$ are service times for two
different customers, assumed independent of each other. Consider the total service
time $T_o = X_1 + X_2$ for the two customers, also a statistic. The cdf of $T_o$ is, for $t \ge 0$,

$$
\begin{aligned}
F_{T_o}(t) = P(X_1 + X_2 \le t) &= \iint_{\{(x_1, x_2):\ x_1 + x_2 \le t\}} f(x_1, x_2)\, dx_1\, dx_2 \\
&= \int_0^t \int_0^{t - x_1} \lambda e^{-\lambda x_1} \cdot \lambda e^{-\lambda x_2}\, dx_2\, dx_1
 = \int_0^t \left[\lambda e^{-\lambda x_1} - \lambda e^{-\lambda t}\right] dx_1 \\
&= 1 - e^{-\lambda t} - \lambda t e^{-\lambda t}
\end{aligned}
$$

The region of integration is pictured in Figure 5.10.

Figure 5.10 Region of integration to obtain cdf of $T_o$ in Example 5.22

The pdf of $T_o$ is obtained by differentiating $F_{T_o}(t)$:

$$
f_{T_o}(t) =
\begin{cases}
\lambda^2 t e^{-\lambda t} & t \ge 0 \\
0 & t < 0
\end{cases}
\tag{5.5}
$$

This is a gamma pdf ($\alpha = 2$ and $\beta = 1/\lambda$). The pdf of $\bar{X} = T_o/2$ is obtained from the
relation $\{\bar{X} \le \bar{x}\}$ iff $\{T_o \le 2\bar{x}\}$ as

$$
f_{\bar{X}}(\bar{x}) =
\begin{cases}
4\lambda^2 \bar{x} e^{-2\lambda \bar{x}} & \bar{x} \ge 0 \\
0 & \bar{x} < 0
\end{cases}
\tag{5.6}
$$

The mean and variance of the underlying exponential distribution are $\mu = 1/\lambda$ and
$\sigma^2 = 1/\lambda^2$. From Expressions (5.5) and (5.6), it can be verified that $E(\bar{X}) = 1/\lambda$,
$V(\bar{X}) = 1/(2\lambda^2)$, $E(T_o) = 2/\lambda$, and $V(T_o) = 2/\lambda^2$. These results again suggest some
general relationships between means and variances of $\bar{X}$, $T_o$, and the underlying
distribution. ■
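
The cdf derived in Example 5.22 can be checked against simulated totals. In the sketch below the rate λ = 0.5, the evaluation point t = 3, the seed, and the number of simulated pairs are all arbitrary choices made for illustration; none of them appears in the example.

```python
import math
import random
import statistics


def cdf_total(t, lam):
    """cdf of the total of two independent exponential(lam) service times."""
    if t < 0:
        return 0.0
    return 1 - math.exp(-lam * t) - lam * t * math.exp(-lam * t)


def simulate_totals(lam, k, seed=0):
    """Simulate k totals X1 + X2 with X1, X2 independent exponential(lam)."""
    rng = random.Random(seed)
    return [rng.expovariate(lam) + rng.expovariate(lam) for _ in range(k)]


lam = 0.5  # hypothetical rate, chosen only for illustration
t = 3.0
totals = simulate_totals(lam, 100_000)
empirical_cdf = sum(value <= t for value in totals) / len(totals)

print("derived cdf at t:  ", cdf_total(t, lam))
print("simulated cdf at t:", empirical_cdf)
print("mean of totals:", statistics.fmean(totals), "vs 2/lam =", 2 / lam)
print("variance of totals:", statistics.variance(totals), "vs 2/lam^2 =", 2 / lam ** 2)
```

Agreement to two or three decimal places is all that should be expected from 100,000 simulated pairs; the derivation, not the simulation, is what establishes the result.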


Simulation Experiments 


The second method for obtaining information about a statistic’s sampling distribu- 
tion is to perform a simulation experiment. This method is usually used when a deri- 
vation via probability rules is very difficult or even impossible. Such an experiment 
is virtually always done with the aid of a computer. The following characteristics of 
an experiment must be specified: 


1. The statistic of interest ($\bar{X}$, S, a particular trimmed mean, etc.)


2. The population distribution (normal with $\mu = 100$ and $\sigma = 15$, uniform with
lower limit A = 5 and upper limit B = 10, etc.)


3. The sample size n (e.g., n = 10 or n = 50)


4. The number of replications k (number of samples to be obtained)

Then use appropriate software to obtain k different random samples, each of size
n, from the designated population distribution. For each sample, calculate the
value of the statistic and construct a histogram of the k values. This histogram
gives the approximate sampling distribution of the statistic. The larger the value
of k, the better the approximation will tend to be (the actual sampling distribution
emerges as $k \to \infty$). In practice, k = 500 or 1000 is usually sufficient if the statistic
is "fairly simple."
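
The four ingredients listed above map directly onto the arguments of a small simulation routine. The following sketch is one possible arrangement, not a prescribed implementation: the function names, the choice of the sample median as the statistic, and the ten-bin histogram are assumptions made for illustration. The uniform population on [5, 10] is the one mentioned in item 2.

```python
import random
import statistics


def simulate_statistic(statistic, draw, n, k, seed=0):
    """Return k simulated values of a statistic, each from a sample of size n.

    statistic: function of a list of observations (ingredient 1)
    draw:      function of a random.Random returning one observation (ingredient 2)
    n, k:      sample size and number of replications (ingredients 3 and 4)
    """
    rng = random.Random(seed)
    return [statistic([draw(rng) for _ in range(n)]) for _ in range(k)]


def relative_frequencies(values, lo, hi, bins):
    """Relative frequency of the values in each of `bins` equal-width classes on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(values) for c in counts]


# Sample median from a uniform population on [5, 10], with n = 10 and k = 1000.
medians = simulate_statistic(
    statistics.median, lambda rng: rng.uniform(5, 10), n=10, k=1000
)
histogram = relative_frequencies(medians, 5, 10, bins=10)

for i, freq in enumerate(histogram):
    print(f"[{5 + 0.5 * i:4.1f}, {5.5 + 0.5 * i:4.1f})  {freq:.3f}")
```

Passing the statistic and the population as functions keeps the routine general: a trimmed mean or a different population requires changing only the two arguments.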


EXAMPLE 5.23 The population distribution for our first simulation study is normal with $\mu = 8.25$
and $\sigma = .75$, as pictured in Figure 5.11. [The article "Platelet Size in Myocardial
Infarction" (British Med. J., 1983: 449-451) suggests this distribution for platelet
volume in individuals with no history of serious heart problems.]

Figure 5.11 Normal distribution, with $\mu = 8.25$ and $\sigma = .75$

We actually performed four different experiments, with 500 replications for each
one. In the first experiment, 500 samples of n = 5 observations each were generated
using Minitab, and the sample sizes for the other three were n = 10, n = 20, and
n = 30, respectively. The sample mean was calculated for each sample, and the
resulting histograms of $\bar{x}$ values appear in Figure 5.12.


Figure 5.12 Sample histograms for $\bar{X}$ based on 500 samples, each consisting of n observations:
(a) n = 5; (b) n = 10; (c) n = 20; (d) n = 30 (vertical axes: relative frequency; horizontal axes: $\bar{x}$)




The first thing to notice about the histograms is their shape. To a reasonable
approximation, each of the four looks like a normal curve. The resemblance would
be even more striking if each histogram had been based on many more than 500 $\bar{x}$
values. Second, each histogram is centered approximately at 8.25, the mean of the
population being sampled. Had the histograms been based on an unending sequence
of $\bar{x}$ values, their centers would have been exactly the population mean, 8.25.

The final aspect of the histograms to note is their spread relative to one
another. The larger the value of n, the more concentrated is the sampling distribution
about the mean value. This is why the histograms for n = 20 and n = 30 are
based on narrower class intervals than those for the two smaller sample sizes. For
the larger sample sizes, most of the $\bar{x}$ values are quite close to 8.25. This is the effect
of averaging. When n is small, a single unusual x value can result in an $\bar{x}$ value far
from the center. With a larger sample size, any unusual x values, when averaged in
with the other sample values, still tend to yield an $\bar{x}$ value close to $\mu$. Combining
these insights yields a result that should appeal to your intuition: $\bar{X}$ based on a large
n tends to be closer to $\mu$ than does $\bar{X}$ based on a small n. ■
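
The four experiments of Example 5.23 can be repeated with a few lines of code. This sketch uses Python's `random.gauss` in place of Minitab, with arbitrary seeds, so the individual values differ from those behind Figure 5.12; only the pattern in center and spread should be compared.

```python
import random
import statistics

MU, SIGMA = 8.25, 0.75  # population of Example 5.23


def sample_means(n, k=500, seed=0):
    """Return k sample means, each from n normal(MU, SIGMA) observations."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)]) for _ in range(k)
    ]


results = {n: sample_means(n, seed=n) for n in (5, 10, 20, 30)}

for n, means in results.items():
    center = statistics.fmean(means)
    spread = statistics.stdev(means)
    print(f"n = {n:2d}: average of sample means = {center:.3f}, std dev = {spread:.3f}")
```

The averages stay close to 8.25 for every n, while the standard deviation of the 500 sample means shrinks as n grows, matching the narrowing histograms described above.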


EXAMPLE 5.24 Consider a simulation experiment in which the population distribution is quite
skewed. Figure 5.13 shows the density curve for lifetimes of a certain type of
electronic control [this is actually a lognormal distribution with E(ln(X)) = 3 and
V(ln(X)) = .16]. Again the statistic of interest is the sample mean $\bar{X}$. The experiment
utilized 500 replications and considered the same four sample sizes as in Example 5.23.
The resulting histograms along with a normal probability plot from Minitab
for the 500 $\bar{x}$ values based on n = 30 are shown in Figure 5.14.

Figure 5.13 Density curve for the simulation experiment of Example 5.24 [E(X) = 21.7584,
V(X) = 82.1449]


Figure 5.14 Results of the simulation experiment of Example 5.24: (a) $\bar{X}$ histogram for
n = 5; (b) $\bar{X}$ histogram for n = 10; (c) $\bar{X}$ histogram for n = 20; (d) $\bar{X}$ histogram for n = 30;
(e) normal probability plot for n = 30 (from Minitab)
[Minitab summary shown with panel (e): Average 21.7891, StDev 1.57396, N 500; W-test for
normality: R 0.9975, P-value (approx) > 0.1000]

Unlike the normal case, these histograms all differ in shape. In particular, they
become progressively less skewed as the sample size n increases. The averages of
the 500 $\bar{x}$ values for the four different sample sizes are all quite close to the mean
value of the population distribution. If each histogram had been based on an unending
sequence of $\bar{x}$ values rather than just 500, all four would have been centered at
exactly 21.7584. Thus different values of n change the shape but not the center of
the sampling distribution of $\bar{X}$. Comparison of the four histograms in Figure 5.14
also shows that as n increases, the spread of the histograms decreases. Increasing
n results in a greater degree of concentration about the population mean value and
makes the histogram look more like a normal curve. The histogram of Figure 5.14(d)
and the normal probability plot in Figure 5.14(e) provide convincing evidence that
a sample size of n = 30 is sufficient to overcome the skewness of the population
distribution and give an approximately normal $\bar{X}$ sampling distribution. ■
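
A rough numerical counterpart to Figure 5.14 is sketched below. It first recovers the population mean and variance from the standard lognormal formulas, then simulates sample means for the four sample sizes and reports their center, spread, and a moment-based skewness coefficient. The skewness measure, the seeds, and the function names are choices made here for illustration; the example itself judges shape from histograms and a normal probability plot.

```python
import math
import random
import statistics

M, S = 3.0, 0.4  # E(ln X) = 3 and V(ln X) = .16, so the sd of ln X is .4

# Population mean and variance from the standard lognormal formulas.
pop_mean = math.exp(M + S ** 2 / 2)
pop_var = math.exp(2 * M + S ** 2) * (math.exp(S ** 2) - 1)


def skewness(values):
    """Moment-based skewness coefficient (0 for a symmetric data set)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])


def sample_means(n, k=500, seed=0):
    """Return k sample means, each from n lognormal(M, S) observations."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.lognormvariate(M, S) for _ in range(n)])
        for _ in range(k)
    ]


results = {n: sample_means(n, seed=n) for n in (5, 10, 20, 30)}

print(f"population: E(X) = {pop_mean:.4f}, V(X) = {pop_var:.4f}")
for n, means in results.items():
    print(
        f"n = {n:2d}: average = {statistics.fmean(means):.3f}, "
        f"std dev = {statistics.stdev(means):.3f}, skewness = {skewness(means):.3f}"
    )
```

With only 500 replications the skewness estimates are noisy, so a single run may not show a perfectly monotone decline; the centers near 21.76 and the shrinking spread are the more stable features.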


EXERCISES Section 5.3 (37-45)


37. A particular brand of dishwasher soap is sold in three sizes: 25 oz, 40 oz, and 65 oz.
Twenty percent of all purchasers select a 25-oz box, 50% select a 40-oz box, and the
remaining 30% choose a 65-oz box. Let \(X_1\) and \(X_2\) denote the package sizes selected by
two independently selected purchasers.
a. Determine the sampling distribution of \(\bar{X}\), calculate \(E(\bar{X})\), and compare to \(\mu\).
b. Determine the sampling distribution of the sample variance \(S^2\), calculate \(E(S^2)\),
and compare to \(\sigma^2\).

38. There are two traffic lights on a commuter's route to and from work. Let \(X_1\) be the
number of lights at which the commuter must stop on his way to work, and \(X_2\) be the
number of lights at which he must stop when returning from work. Suppose these two
variables are independent, each with pmf given in the accompanying table (so \(X_1, X_2\)
is a random sample of size n = 2).

x1       0    1    2
p(x1)    .2   .5   .3        \(\mu = 1.1\), \(\sigma^2 = .49\)

a. Determine the pmf of \(T_o = X_1 + X_2\).
b. Calculate \(\mu_{T_o}\). How does it relate to \(\mu\), the population mean?
c. Calculate \(\sigma_{T_o}^2\). How does it relate to \(\sigma^2\), the population variance?
d. Let \(X_3\) and \(X_4\) be the number of lights at which a stop is required when driving to
and from work on a second day assumed independent of the first day. With \(T_o\) = the sum
of all four \(X_i\)'s, what now are the values of \(E(T_o)\) and \(V(T_o)\)?
e. Referring back to (d), what are the values of \(P(T_o = 8)\) and \(P(T_o \ge 7)\)? [Hint: Don't
even think of listing all possible outcomes!]

39. It is known that 80% of all brand A external hard drives work in a satisfactory manner
throughout the warranty period (are "successes"). Suppose that n = 15 drives are randomly
selected. Let X = the number of successes in the sample. The statistic X/n is the sample
proportion (fraction) of successes. Obtain the sampling distribution of this statistic.
[Hint: One possible value of X/n is .2, corresponding to X = 3. What is the probability of
this value (what kind of rv is X)?]

40. A box contains ten sealed envelopes numbered 1, ..., 10. The first five contain no money,
the next three each contain $5, and there is a $10 bill in each of the last two. A sample of
size 3 is selected with replacement (so we have a random sample), and you get the largest
amount in any of the envelopes selected. If \(X_1\), \(X_2\), and \(X_3\) denote the amounts in the
selected envelopes, the statistic of interest is M = the maximum of \(X_1\), \(X_2\), and \(X_3\).
a. Obtain the probability distribution of this statistic.
b. Describe how you would carry out a simulation experiment to compare the distributions
of M for various sample sizes. How would you guess the distribution would change as n
increases?

41. Let X be the number of packages being mailed by a randomly selected customer at a
certain shipping facility. Suppose the distribution of X is as follows:

x       1    2    3    4
p(x)    .4   .3   .2   .1

a. Consider a random sample of size n = 2 (two customers), and let \(\bar{X}\) be the sample mean
number of packages shipped. Obtain the probability distribution of \(\bar{X}\).
b. Refer to part (a) and calculate \(P(\bar{X} \le 2.5)\).
c. Again consider a random sample of size n = 2, but now focus on the statistic R = the
sample range (difference between the largest and smallest values in the sample). Obtain
the distribution of R. [Hint: Calculate the value of R for each outcome and use the
probabilities from part (a).]
d. If a random sample of size n = 4 is selected, what is \(P(\bar{X} \le 1.5)\)? [Hint: You should
not have to list all possible outcomes, only those for which \(\bar{x} \le 1.5\).]

42. A company maintains three offices in a certain region, each staffed by two employees.
Information concerning yearly salaries (1000s of dollars) is as follows:

Office      1      1      2      2      3      3
Employee    1      2      3      4      5      6
Salary      29.7   33.6   30.2   33.6   25.8   29.7

a. Suppose two of these employees are randomly selected from among the six (without
replacement). Determine the sampling distribution of the sample mean salary \(\bar{X}\).
b. Suppose one of the three offices is randomly selected. Let \(X_1\) and \(X_2\) denote the
salaries of the two employees. Determine the sampling distribution of \(\bar{X}\).
c. How does \(E(\bar{X})\) from parts (a) and (b) compare to the population mean salary \(\mu\)?

43. Suppose the amount of liquid dispensed by a certain machine is uniformly distributed
with lower limit A = 8 oz and upper limit B = 10 oz. Describe how you would carry out
simulation experiments to compare the sampling distribution of the (sample) fourth spread
for sample sizes n = 5, 10, 20, and 30.

44. Carry out a simulation experiment using a statistical computer package or other
software to study the sampling distribution of \(\bar{X}\) when the population distribution is
Weibull with \(\alpha = 2\) and \(\beta = 5\), as in Example 5.20. Consider the four sample sizes
n = 5, 10, 20, and 30, and in each case use 1000 replications. For which of these sample
sizes does the \(\bar{X}\) sampling distribution appear to be approximately normal?

45. Carry out a simulation experiment using a statistical computer package or other
software to study the sampling distribution of \(\bar{X}\) when the population distribution is
lognormal with \(E(\ln(X)) = 3\) and \(V(\ln(X)) = 1\). Consider the four sample sizes n = 10, 20,
30, and 50, and in each case use 1000 replications. For which of these sample sizes does
the \(\bar{X}\) sampling distribution appear to be approximately normal?


5.4 The Distribution of the Sample Mean


The importance of the sample mean \(\bar{X}\) springs from its use in drawing conclusions
about the population mean \(\mu\). Some of the most frequently used inferential procedures
are based on properties of the sampling distribution of \(\bar{X}\). A preview of these properties
appeared in the calculations and simulation experiments of the previous section,
where we noted relationships between \(E(\bar{X})\) and \(\mu\) and also among \(V(\bar{X})\), \(\sigma^2\), and n.


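PROPOSITION Let \(X_1, X_2, \ldots, X_n\) be a random sample from a distribution with mean value \(\mu\)
and standard deviation \(\sigma\). Then
1. \(E(\bar{X}) = \mu_{\bar{X}} = \mu\)
2. \(V(\bar{X}) = \sigma_{\bar{X}}^2 = \sigma^2/n\) and \(\sigma_{\bar{X}} = \sigma/\sqrt{n}\)

In addition, with \(T_o = X_1 + \cdots + X_n\) (the sample total), \(E(T_o) = n\mu\),
\(V(T_o) = n\sigma^2\), and \(\sigma_{T_o} = \sqrt{n}\,\sigma\).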


Proofs of these results are deferred to the next section. According to Result 1, the
sampling (i.e., probability) distribution of \(\bar{X}\) is centered precisely at the mean of the
population from which the sample has been selected. Result 2 shows that the \(\bar{X}\) distribution
becomes more concentrated about \(\mu\) as the sample size n increases. In marked
contrast, the distribution of \(T_o\) becomes more spread out as n increases. Averaging
moves probability in toward the middle, whereas totaling spreads probability out
over a wider and wider range of values. The standard deviation \(\sigma_{\bar{X}} = \sigma/\sqrt{n}\) is often
called the standard error of the mean; it describes the magnitude of a typical or
representative deviation of the sample mean from the population mean.

EXAMPLE 5.25 In a notched tensile fatigue test on a titanium specimen, the expected number of
cycles to first acoustic emission (used to indicate crack initiation) is \(\mu = 28{,}000\),
and the standard deviation of the number of cycles is \(\sigma = 5000\). Let \(X_1, X_2, \ldots, X_{25}\)
be a random sample of size 25, where each \(X_i\) is the number of cycles on a different
randomly selected specimen. Then the expected value of the sample mean number of
cycles until first emission is \(E(\bar{X}) = \mu = 28{,}000\), and the expected total number of
cycles for the 25 specimens is \(E(T_o) = n\mu = 25(28{,}000) = 700{,}000\). The standard
deviations of \(\bar{X}\) (standard error of the mean) and of \(T_o\) are

\[ \sigma_{\bar{X}} = \sigma/\sqrt{n} = \frac{5000}{\sqrt{25}} = 1000 \]

\[ \sigma_{T_o} = \sqrt{n}\,\sigma = \sqrt{25}\,(5000) = 25{,}000 \]

If the sample size increases to n = 100, \(E(\bar{X})\) is unchanged, but \(\sigma_{\bar{X}} = 500\), half of its
previous value (the sample size must be quadrupled to halve the standard deviation
of \(\bar{X}\)). ■
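The arithmetic of this example follows directly from the proposition and can be reproduced with a small helper. The function name `mean_and_total_summary` and its dictionary keys are hypothetical, introduced only for this sketch.

```python
import math


def mean_and_total_summary(mu, sigma, n):
    """Means and standard deviations of the sample mean and the sample total."""
    return {
        "mean_of_xbar": mu,                    # E(Xbar) = mu
        "sd_of_xbar": sigma / math.sqrt(n),    # standard error of the mean
        "mean_of_total": n * mu,               # E(T_o) = n * mu
        "sd_of_total": math.sqrt(n) * sigma,   # sd of T_o = sqrt(n) * sigma
    }


s25 = mean_and_total_summary(28_000, 5000, 25)
s100 = mean_and_total_summary(28_000, 5000, 100)
print(s25)
print(s100["sd_of_xbar"])  # quadrupling n halves the standard error
```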


The Case of a Normal Population Distribution

The simulation experiment of Example 5.23 indicated that when the population
distribution is normal, a histogram of \(\bar{x}\) values for any sample size n is well
approximated by a normal curve.


PROPOSITION Let \(X_1, X_2, \ldots, X_n\) be a random sample from a normal distribution with mean \(\mu\)
and standard deviation \(\sigma\). Then for any n, \(\bar{X}\) is normally distributed (with
mean \(\mu\) and standard deviation \(\sigma/\sqrt{n}\)), as is \(T_o\) (with mean \(n\mu\) and standard
deviation \(\sqrt{n}\,\sigma\)).*


We know everything there is to know about the \(\bar{X}\) and \(T_o\) distributions when the
population distribution is normal. In particular, probabilities such as \(P(a \le \bar{X} \le b)\)
and \(P(c \le T_o \le d)\) can be obtained simply by standardizing. Figure 5.15 illustrates
the \(\bar{X}\) part of the proposition.


[Figure 5.15 curves: \(\bar{X}\) distribution when n = 10; \(\bar{X}\) distribution when n = 4; population distribution]

Figure 5.15 A normal population distribution and \(\bar{X}\) sampling distributions


* A proof of the result for \(T_o\) when n = 2 is possible using the method in Example 5.22, but the details are
messy. The general result is usually proved using a theoretical tool called a moment generating function.
One of the chapter references can be consulted for more information.

EXAMPLE 5.26 The distribution of egg weights (g) of a certain type is normal with mean value 53
and standard deviation .3 (consistent with data in the article "Evaluation of Egg
Quality Traits of Chickens Reared under Backyard System in Western Uttar
Pradesh" (Indian J. of Poultry Sci., 2009: 261-262)). Let \(X_1, X_2, \ldots, X_{12}\) denote the
weights of a dozen randomly selected eggs; these \(X_i\)'s constitute a random sample of
size 12 from the specified normal distribution.
The total weight of the 12 eggs is \(T_o = X_1 + \cdots + X_{12}\); it is normally distributed
with mean value \(E(T_o) = n\mu = 12(53) = 636\) and variance \(V(T_o) = n\sigma^2 = 12(.3)^2 = 1.08\).
The probability that the total weight is between 635 and 640 is now obtained by
standardizing and referring to Appendix Table A.3:

\[
\begin{aligned}
P(635 < T_o < 640) &= P\left(\frac{635 - 636}{\sqrt{1.08}} < Z < \frac{640 - 636}{\sqrt{1.08}}\right) = P(-.96 < Z < 3.85) \\
&= \Phi(3.85) - \Phi(-.96) \approx 1 - .1685 = .8315
\end{aligned}
\]

If cartons containing a dozen eggs are repeatedly selected, in the long run slightly
more than 83% of the cartons will have a total weight between 635 g and 640 g.
Notice that \(635 < T_o < 640\) is equivalent to \(52.9167 < \bar{X} < 53.3333\) (divide each
term in the original system of inequalities by 12). Thus \(P(52.9167 < \bar{X} < 53.3333) \approx .8315\).
This latter probability can also be obtained by standardizing \(\bar{X}\) directly.

Now consider randomly selecting just four of these eggs. The sample mean
weight \(\bar{X}\) is then normally distributed with mean value \(\mu_{\bar{X}} = \mu = 53\) and standard
deviation \(\sigma_{\bar{X}} = \sigma/\sqrt{n} = .3/\sqrt{4} = .15\). The probability that the sample mean
weight exceeds 53.5 g is then

\[
P(\bar{X} > 53.5) = P\left(Z > \frac{53.5 - 53}{.15}\right) = P(Z > 3.33) = 1 - \Phi(3.33) = 1 - .9996 = .0004
\]

Because 53.5 is 3.33 standard deviations (of \(\bar{X}\)) larger than the mean value 53, it is
exceedingly unlikely that the sample mean will exceed 53.5. ■
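The two probabilities in this example can also be obtained without a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative. Because software does not round z to two decimals, the results may differ from the table-based values in the third or fourth decimal place.

```python
import math
from statistics import NormalDist

mu, sigma = 53.0, 0.3

# Total weight of n = 12 eggs: normal with mean n*mu and sd sqrt(n)*sigma.
total = NormalDist(mu=12 * mu, sigma=math.sqrt(12) * sigma)
p_total = total.cdf(640) - total.cdf(635)

# Mean weight of n = 4 eggs: normal with mean mu and sd sigma/sqrt(n).
xbar = NormalDist(mu=mu, sigma=sigma / math.sqrt(4))
p_xbar = 1 - xbar.cdf(53.5)

print(round(p_total, 4), round(p_xbar, 4))
```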


The Central Limit Theorem 


When the \(X_i\)'s are normally distributed, so is \(\bar{X}\) for every sample size n. The derivations
in Example 5.21 and simulation experiment of Example 5.24 suggest that even
when the population distribution is highly nonnormal, averaging produces a distribution
more bell-shaped than the one being sampled. A reasonable conjecture is
that if n is large, a suitable normal curve will approximate the actual distribution of
\(\bar{X}\). The formal statement of this result is the most important theorem of probability.


THEOREM The Central Limit Theorem (CLT)


Let \(X_1, X_2, \ldots, X_n\) be a random sample from a distribution with mean \(\mu\) and
variance \(\sigma^2\). Then if n is sufficiently large, \(\bar{X}\) has approximately a normal distribution
with \(\mu_{\bar{X}} = \mu\) and \(\sigma_{\bar{X}}^2 = \sigma^2/n\), and \(T_o\) also has approximately a normal
distribution with \(\mu_{T_o} = n\mu\), \(\sigma_{T_o}^2 = n\sigma^2\). The larger the value of n, the better the
approximation.


Figure 5.16 illustrates the Central Limit Theorem. According to the CLT, when n is
large and we wish to calculate a probability such as \(P(a \le \bar{X} \le b)\), we need only
"pretend" that \(\bar{X}\) is normal, standardize it, and use the normal table. The resulting
answer will be approximately correct. The exact answer could be obtained only by
first finding the distribution of \(\bar{X}\), so the CLT provides a truly impressive shortcut.
The proof of the theorem involves much advanced mathematics.


[Figure 5.16 curves: \(\bar{X}\) distribution for large n (approximately normal); \(\bar{X}\) distribution for
small to moderate n; population distribution; all centered at \(\mu\)]

Figure 5.16 The Central Limit Theorem illustrated


EXAMPLE 5.27 The amount of a particular impurity in a batch of a certain chemical product is a
random variable with mean value 4.0 g and standard deviation 1.5 g. If 50 batches
are independently prepared, what is the (approximate) probability that the sample
average amount of impurity \(\bar{X}\) is between 3.5 and 3.8 g? According to the rule of
thumb to be stated shortly, n = 50 is large enough for the CLT to be applicable.
\(\bar{X}\) then has approximately a normal distribution with mean value \(\mu_{\bar{X}} = 4.0\) and
\(\sigma_{\bar{X}} = 1.5/\sqrt{50} = .2121\), so

\[
\begin{aligned}
P(3.5 \le \bar{X} \le 3.8) &\approx P\left(\frac{3.5 - 4.0}{.2121} \le Z \le \frac{3.8 - 4.0}{.2121}\right) \\
&= \Phi(-.94) - \Phi(-2.36) = .1645
\end{aligned}
\]

Now consider randomly selecting 100 batches, and let \(T_o\) represent the total amount
of impurity in these batches. Then the mean value and standard deviation of \(T_o\) are
100(4) = 400 and \(\sqrt{100}\,(1.5) = 15\), respectively, and the CLT implies that \(T_o\) has
approximately a normal distribution. The probability that this total is at most 425 g is

\[
P(T_o \le 425) \approx P\left(Z \le \frac{425 - 400}{15}\right) = P(Z \le 1.67) = \Phi(1.67) = .9525
\]
■


EXAMPLE 5.28 Let X = the number of different people sent text messages during a particular day by
a randomly selected student at a large university. Suppose the mean value of X is 7
and the standard deviation is 6 (values very close to those reported in the article "Cell
Phone Use and Grade Point Average Among Undergraduate University Students"
(College Student J., 2011: 544-551)). Among 100 randomly selected such students,
how likely is it that the sample mean number of different people texted exceeds 5?
Notice that the distribution being sampled is discrete, but the CLT is applicable whether
the variable of interest is discrete or continuous. Also, although the fact that the standard
deviation of this nonnegative variable is quite large relative to the mean value suggests
that its distribution is positively skewed, the large sample size implies that \(\bar{X}\) does
have approximately a normal distribution. Using \(\mu_{\bar{X}} = 7\) and \(\sigma_{\bar{X}} = .6\),

\[
P(\bar{X} > 5) \approx P\left(Z > \frac{5 - 7}{.6}\right) = 1 - \Phi(-3.33) = .9996
\]

Note: The cited article stated that text messaging frequency was negatively correlated
with GPA. ■


The CLT provides insight into why many random variables have probability distributions
that are approximately normal. For example, the measurement error in a
scientific experiment can be thought of as the sum of a number of underlying perturbations
and errors of small magnitude.

A practical difficulty in applying the CLT is in knowing when n is sufficiently
large. The problem is that the accuracy of the approximation for a particular n
depends on the shape of the original underlying distribution being sampled. If the
underlying distribution is close to a normal density curve, then the approximation
will be good even for a small n, whereas if it is far from being normal, then a large
n will be required.


Rule of Thumb


The Central Limit Theorem can generally be used if n > 30.


There are population distributions for which even an n of 40 or 50 does not suffice,
but such distributions are rarely encountered in practice. On the other hand, the rule
of thumb is often conservative; for many population distributions, an n much less
than 30 would suffice. For example, in the case of a uniform population distribution,
the CLT gives a good approximation for n ≥ 12.


EXAMPLE 5.29 Consider the distribution shown in Figure 5.17 for the amount purchased (rounded
to the nearest dollar) by a randomly selected customer at a particular gas station
(a similar distribution for purchases in Britain (in £) appeared in the article "Data
Mining for Fun and Profit," Statistical Science, 2000: 111-131; there were big
spikes at the values 10, 15, 20, 25, and 30). The distribution is obviously quite
non-normal.

[Figure 5.17: vertical axis Probability (0.00 to 0.16); horizontal axis Purchase amount (10 to 60)]

Figure 5.17 Probability distribution of X = amount of gasoline purchased ($)

We asked Minitab to select 1000 different samples, each consisting of
n = 15 observations, and calculate the value of the sample mean \(\bar{X}\) for each one.
Figure 5.18 is a histogram of the resulting 1000 values; this is the approximate
sampling distribution of \(\bar{X}\) under the specified circumstances. This distribution
is clearly approximately normal even though the sample size is actually much
smaller than 30, our rule-of-thumb cutoff for invoking the Central Limit Theorem.
As further evidence for normality, Figure 5.19 shows a normal probability plot
of the 1000 \(\bar{x}\) values; the linear pattern is very prominent. It is typically not
non-normality in the central part of the population distribution that causes the CLT to
fail, but instead very substantial skewness.

[Figure 5.18: vertical axis Density (0.00 to 0.14); horizontal axis Mean (18 to 36)]

Figure 5.18 Approximate sampling distribution of the sample mean amount purchased when
n = 15 and the population distribution is as shown in Figure 5.17

[Figure 5.19: vertical axis Percent; horizontal axis Mean (15 to 40)]

Figure 5.19 Normal probability plot from Minitab of the 1000 \(\bar{x}\) values based on samples of
size n = 15 ■
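The claim that a sample size well below 30 can suffice, including the uniform case mentioned alongside the rule of thumb, can be checked by simulation. The following is a minimal sketch, assuming a uniform population on (0, 1) (the text does not fix the interval) and a hypothetical helper `empirical_prob`; it compares simulated probabilities for the sample mean with the CLT normal approximation when n = 12.

```python
import random
from statistics import NormalDist, fmean


def empirical_prob(n, cutoff, reps=20_000, seed=7):
    """Fraction of simulated uniform(0, 1) sample means that are <= cutoff."""
    rng = random.Random(seed)
    hits = sum(
        fmean(rng.random() for _ in range(n)) <= cutoff for _ in range(reps)
    )
    return hits / reps


n = 12
# Uniform(0, 1) has mean 1/2 and variance 1/12, so sd of the mean is sqrt(1/12)/sqrt(n).
clt = NormalDist(mu=0.5, sigma=(1 / 12) ** 0.5 / n ** 0.5)

for cutoff in (0.4, 0.5, 0.6):
    # Columns: cutoff, simulated probability, CLT approximation.
    print(cutoff, round(empirical_prob(n, cutoff), 4), round(clt.cdf(cutoff), 4))
```

The simulated and approximate probabilities should agree to roughly two decimal places; the remaining gap is mostly simulation noise.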


Other Applications of the Central Limit Theorem


The CLT can be used to justify the normal approximation to the binomial distribution
discussed in Chapter 4. Recall that a binomial variable X is the number of successes
in a binomial experiment consisting of n independent success/failure trials
with p = P(S) for any particular trial. Define a new rv \(X_1\) by

\[
X_1 = \begin{cases} 1 & \text{if the 1st trial results in a success} \\ 0 & \text{if the 1st trial results in a failure} \end{cases}
\]

and define \(X_2, X_3, \ldots, X_n\) analogously for the other n - 1 trials. Each \(X_i\) indicates
whether or not there is a success on the corresponding trial.
Because the trials are independent and P(S) is constant from trial to trial, the
\(X_i\)'s are iid (a random sample from a Bernoulli distribution). The CLT then implies
that if n is sufficiently large, both the sum and the average of the \(X_i\)'s have approximately
normal distributions. When the \(X_i\)'s are summed, a 1 is added for every S that
occurs and a 0 for every F, so \(X_1 + \cdots + X_n = X\). The sample mean of the \(X_i\)'s is
X/n, the sample proportion of successes. That is, both X and X/n are approximately
normal when n is large. The necessary sample size for this approximation depends
on the value of p: When p is close to .5, the distribution of each \(X_i\) is reasonably
symmetric (see Figure 5.20), whereas the distribution is quite skewed when p is near
0 or 1. Using the approximation only if both \(np \ge 10\) and \(n(1 - p) \ge 10\) ensures that
n is large enough to overcome any skewness in the underlying Bernoulli distribution.


[Figure 5.20: panels (a) and (b)]

Figure 5.20 Two Bernoulli distributions: (a) p = .4 (reasonably symmetric); (b) p = .1
(very skewed)


Consider n independent Poisson rv's \(X_1, \ldots, X_n\), each having mean value \(\mu/n\). It
can be shown that \(X = X_1 + \cdots + X_n\) has a Poisson distribution with mean value \(\mu\)
(because in general a sum of independent Poisson rv's has a Poisson distribution).
The CLT then implies that a Poisson rv with sufficiently large \(\mu\) has approximately
a normal distribution. A common rule of thumb for this is \(\mu > 20\).
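Both the binomial and the Poisson normal approximations can be compared with exact probabilities. The sketch below is illustrative only: the parameter values (n = 50, p = .4, and a Poisson mean of 25) are chosen here simply to satisfy the stated rules of thumb, the function names are hypothetical, and the .5 continuity correction is the usual device for approximating a discrete distribution, assumed here rather than restated in this section.

```python
import math
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def poisson_cdf(k, mu):
    """Exact P(X <= k) for X ~ Poisson(mu), built up term by term."""
    term, total = math.exp(-mu), 0.0
    for i in range(k + 1):
        total += term
        term *= mu / (i + 1)
    return total


def normal_approx_cdf(k, mean, sd):
    """Normal approximation to P(X <= k), with a 0.5 continuity correction (assumed)."""
    return NormalDist(mean, sd).cdf(k + 0.5)


n, p = 50, 0.4  # np = 20 and n(1 - p) = 30, both at least 10
print(binom_cdf(22, n, p),
      normal_approx_cdf(22, n * p, math.sqrt(n * p * (1 - p))))

mu = 25  # exceeds the mu > 20 rule of thumb
print(poisson_cdf(28, mu), normal_approx_cdf(28, mu, math.sqrt(mu)))
```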

Lastly, recall from Section 4.5 that X has a lognormal distribution if ln(X) has
a normal distribution. Let \(X_1, X_2, \ldots, X_n\) be a random sample from a distribution
for which only positive values are possible [\(P(X_i > 0) = 1\)]. Then if n is sufficiently
large, the product \(Y = X_1 X_2 \cdots X_n\) has approximately a lognormal distribution.

To verify this, note that

\[ \ln(Y) = \ln(X_1) + \ln(X_2) + \cdots + \ln(X_n) \]

Since ln(Y) is a sum of independent and identically distributed rv's [the \(\ln(X_i)\)'s], it is
approximately normal when n is large, so Y itself has approximately a lognormal distribution.
As an example of the applicability of this result, Bury (Statistical Models
in Applied Science, Wiley, p. 590) argues that the damage process in plastic flow and
crack propagation is a multiplicative process, so that variables such as percentage
elongation and rupture strength have approximately lognormal distributions.
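The multiplicative argument can be illustrated by simulation. This is a rough sketch under assumptions made here for illustration: the factors are uniform on (0.5, 1.5), n = 40, and `skewness` and `simulate_products` are hypothetical helpers. If the product is approximately lognormal, its logarithm should look roughly symmetric while the product itself is strongly right-skewed.

```python
import math
import random
from statistics import fmean, pstdev


def skewness(values):
    """Moment-based sample skewness (near 0 for a symmetric distribution)."""
    m, s = fmean(values), pstdev(values)
    return fmean(((v - m) / s) ** 3 for v in values)


def simulate_products(n=40, reps=5000, seed=3):
    """Products of n iid uniform(0.5, 1.5) factors (an assumed positive population)."""
    rng = random.Random(seed)
    return [math.prod(rng.uniform(0.5, 1.5) for _ in range(n)) for _ in range(reps)]


products = simulate_products()
logs = [math.log(y) for y in products]  # ln(Y) is a sum of n iid terms

# Skewness of Y itself versus skewness of ln(Y).
print(round(skewness(products), 2), round(skewness(logs), 2))
```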


EXERCISES Section 5.4 (46-57)


46. Young's modulus is a quantitative measure of stiffness of an elastic material. Suppose
that for aluminum alloy sheets of a particular type, its mean value and standard deviation
are 70 GPa and 1.6 GPa, respectively (values given in the article "Influence of Material
Properties Variability on Springback and Thinning in Sheet Stamping Processes: A
Stochastic Analysis" (Intl. J. of Advanced Manuf. Tech., 2010: 117-134)).
a. If \(\bar{X}\) is the sample mean Young's modulus for a random sample of n = 16 sheets, where
is the sampling distribution of \(\bar{X}\) centered, and what is the standard deviation of the
\(\bar{X}\) distribution?
b. Answer the questions posed in part (a) for a sample size of n = 64 sheets.
c. For which of the two random samples, the one of part (a) or the one of part (b), is \(\bar{X}\)
more likely to be within 1 GPa of 70 GPa? Explain your reasoning.

47. Refer to Exercise 46. Suppose the distribution is normal (the cited article makes that
assumption and even includes the corresponding normal density curve).
a. Calculate \(P(69 \le \bar{X} \le 71)\) when n = 16.
b. How likely is it that the sample mean Young's modulus exceeds 71 when n = 25?

48. The National Health Statistics Reports dated Oct. 22, 2008, stated that for a sample
size of 277 18-year-old American males, the sample mean waist circumference was 86.3 cm.
A somewhat complicated method was used to estimate various population percentiles,
resulting in the following values:

5th     10th    25th    50th    75th    90th     95th
69.6    70.9    75.2    81.3    95.4    107.1    116.4

a. Is it plausible that the waist size distribution is at least approximately normal?
Explain your reasoning. If your answer is no, conjecture the shape of the population
distribution.
b. Suppose that the population mean waist size is 85 cm and that the population standard
deviation is 15 cm. How likely is it that a random sample of 277 individuals will result
in a sample mean waist size of at least 86.3 cm?
c. Referring back to (b), suppose now that the population mean waist size is 82 cm. Now
what is the (approximate) probability that the sample mean will be at least 86.3 cm? In
light of this calculation, do you think that 82 cm is a reasonable value for \(\mu\)?

49. There are 40 students in an elementary statistics class. On the basis of years of
experience, the instructor knows that the time needed to grade a randomly chosen first
examination paper is a random variable with an expected value of 6 min and a standard
deviation of 6 min.
a. If grading times are independent and the instructor begins grading at 6:50 p.m. and
grades continuously, what is the (approximate) probability that he is through grading
before the 11:00 p.m. TV news begins?
b. If the sports report begins at 11:10, what is the probability that he misses part of the
report if he waits until grading is done before turning on the TV?

50. Let X denote the courtship time for a randomly selected female-male pair of mating
scorpion flies (time from the beginning of interaction until mating). Suppose the mean
value of X is 120 min and the standard deviation of X is 110 min (suggested by data in the
article "Should I Stay or Should I Go? Condition- and Status-Dependent Courtship
Decisions in the Scorpion Fly Panorpa Cognata" (Animal Behavior, 2009: 491-497)).
a. Is it plausible that X is normally distributed?
b. For a random sample of 50 such pairs, what is the (approximate) probability that the
sample mean courtship time is between 100 min and 125 min?
c. For a random sample of 50 such pairs, what is the (approximate) probability that the
total courtship time exceeds 150 hr?
d. Could the probability requested in (b) be calculated from the given information if the
sample size were 15 rather than 50? Explain.

51. The time taken by a randomly selected applicant for a mortgage to fill out a certain
form has a normal distribution with mean value 10 min and standard deviation 2 min. If
five individuals fill out a form on one day and six on another, what is the probability that
the sample average amount of time taken on each day is at most 11 min?

52. The lifetime of a certain type of battery is normally distributed with mean value
10 hours and standard deviation 1 hour. There are four batteries in a package. What
lifetime value is such that the total lifetime of all batteries in a package exceeds that
value for only 5% of all packages?

53. Rockwell hardness of pins of a certain type is known to have a mean value of 50 and a
standard deviation of 1.2.
a. If the distribution is normal, what is the probability that the sample mean hardness
for a random sample of 9 pins is at least 51?
b. Without assuming population normality, what is the (approximate) probability that the
sample mean hardness for a random sample of 40 pins is at least 51?

54. Suppose the sediment density (g/cm) of a randomly selected specimen from a certain
region is normally distributed with mean 2.65 and standard deviation .85 (suggested in
"Modeling Sediment and Water Column Interactions for Hydrophobic Pollutants," Water
Research, 1984: 1169-1174).
a. If a random sample of 25 specimens is selected, what is the probability that the sample
average sediment density is at most 3.00? Between 2.65 and 3.00?
b. How large a sample size would be required to ensure that the first probability in
part (a) is at least .99?

55. The number of parking tickets issued in a certain city on any given weekday has a
Poisson distribution with parameter \(\mu = 50\).
a. Calculate the approximate probability that between 35 and 70 tickets are given out on
a particular day.
b. Calculate the approximate probability that the total number of tickets given out during
a 5-day week is between 225 and 275.
c. Use software to obtain the exact probabilities in (a) and (b) and compare to their
approximations.

56. A binary communication channel transmits a sequence of "bits" (0s and 1s). Suppose
that for any particular bit transmitted, there is a 10% chance of a transmission error
(a 0 becoming a 1 or a 1 becoming a 0). Assume that bit errors occur independently of one
another.
a. Consider transmitting 1000 bits. What is the approximate probability that at most 125
transmission errors occur?
b. Suppose the same 1000-bit message is sent two different times independently of one
another. What is the approximate probability that the number of errors in the first
transmission is within 50 of the number of errors in the second?

57. Suppose the distribution of the time X (in hours) spent by students at a certain
university on a particular project is gamma with parameters \(\alpha = 50\) and \(\beta = 2\). Because
\(\alpha\) is large, it can be shown that X has approximately a normal distribution. Use this fact
to compute the approximate probability that a randomly selected student spends at most
125 hours on the project.


5.5 The Distribution of a Linear Combination


The sample mean \(\bar{X}\) and sample total \(T_o\) are special cases of a type of random variable
that arises very frequently in statistical applications.


DEFINITION Given a collection of n random variables \(X_1, \ldots, X_n\) and n numerical constants
\(a_1, \ldots, a_n\), the rv

\[ Y = a_1X_1 + \cdots + a_nX_n = \sum_{i=1}^{n} a_iX_i \qquad (5.7) \]

is called a linear combination of the \(X_i\)'s.


For example, consider someone who owns 100 shares of stock A, 200 shares of stock
B, and 500 shares of stock C. Denote the share prices of these three stocks at some
particular time by \(X_1\), \(X_2\), and \(X_3\), respectively. Then the value of this individual's
stock holdings is the linear combination \(Y = 100X_1 + 200X_2 + 500X_3\).


Taking \(a_1 = a_2 = \cdots = a_n = 1\) gives \(Y = X_1 + \cdots + X_n = T_o\), and
\(a_1 = a_2 = \cdots = a_n = 1/n\) yields

\[ Y = \frac{1}{n}X_1 + \cdots + \frac{1}{n}X_n = \frac{1}{n}(X_1 + \cdots + X_n) = \frac{1}{n}T_o = \bar{X} \]

Notice that we are not requiring the \(X_i\)'s to be independent or identically distributed.
All the \(X_i\)'s could have different distributions and therefore different mean
values and variances. Our first result concerns the expected value and variance of
a linear combination.


PROPOSITION Let \(X_1, X_2, \ldots, X_n\) have mean values \(\mu_1, \ldots, \mu_n\), respectively, and variances
\(\sigma_1^2, \ldots, \sigma_n^2\), respectively.
1. Whether or not the \(X_i\)'s are independent,

\[
\begin{aligned}
E(a_1X_1 + a_2X_2 + \cdots + a_nX_n) &= a_1E(X_1) + a_2E(X_2) + \cdots + a_nE(X_n) \\
&= a_1\mu_1 + \cdots + a_n\mu_n \qquad (5.8)
\end{aligned}
\]

2. If \(X_1, \ldots, X_n\) are independent,

\[
\begin{aligned}
V(a_1X_1 + a_2X_2 + \cdots + a_nX_n) &= a_1^2V(X_1) + a_2^2V(X_2) + \cdots + a_n^2V(X_n) \\
&= a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2 \qquad (5.9)
\end{aligned}
\]

and

\[ \sigma_{a_1X_1 + \cdots + a_nX_n} = \sqrt{a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2} \qquad (5.10) \]

3. For any \(X_1, \ldots, X_n\),

\[ V(a_1X_1 + \cdots + a_nX_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_ia_j\,\mathrm{Cov}(X_i, X_j) \qquad (5.11) \]


Proofs are sketched out at the end of the section. A paraphrase of (5.8) is that the
expected value of a linear combination is the same as the linear combination of
the expected values; for example, \(E(2X_1 + 5X_2) = 2\mu_1 + 5\mu_2\). The result (5.9) in
Statement 2 is a special case of (5.11) in Statement 3; when the \(X_i\)'s are independent,
\(\mathrm{Cov}(X_i, X_j) = 0\) for \(i \ne j\) and \(= V(X_i)\) for \(i = j\) (this simplification actually occurs
when the \(X_i\)'s are uncorrelated, a weaker condition than independence). Specializing
to the case of a random sample (\(X_i\)'s iid) with \(a_i = 1/n\) for every i gives \(E(\bar{X}) = \mu\) and
\(V(\bar{X}) = \sigma^2/n\), as discussed in Section 5.4. A similar comment applies to the rules for \(T_o\).


EXAMPLE 5.30 A gas station sells three grades of gasoline: regular, extra, and super. These are
priced at $3.00, $3.20, and $3.40 per gallon, respectively. Let \(X_1\), \(X_2\), and \(X_3\) denote
the amounts of these grades purchased (gallons) on a particular day. Suppose the
\(X_i\)'s are independent with \(\mu_1 = 1000\), \(\mu_2 = 500\), \(\mu_3 = 300\), \(\sigma_1 = 100\), \(\sigma_2 = 80\),
and \(\sigma_3 = 50\). The revenue from sales is \(Y = 3.0X_1 + 3.2X_2 + 3.4X_3\), and

\[ E(Y) = 3.0\mu_1 + 3.2\mu_2 + 3.4\mu_3 = \$5620 \]
\[ V(Y) = (3.0)^2\sigma_1^2 + (3.2)^2\sigma_2^2 + (3.4)^2\sigma_3^2 = 184{,}436 \]
\[ \sigma_Y = \sqrt{184{,}436} = \$429.46 \]
■
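The rules (5.8) and (5.11) translate directly into code. The sketch below uses hypothetical helper names and a covariance matrix represented as a nested list; with independent variables the matrix is diagonal, so the double sum in (5.11) collapses to (5.9). It reproduces the revenue figures of this example.

```python
import math


def lin_comb_mean(a, means):
    """E(a1*X1 + ... + an*Xn), valid with or without independence, as in (5.8)."""
    return sum(ai * mi for ai, mi in zip(a, means))


def lin_comb_variance(a, cov):
    """V(a1*X1 + ... + an*Xn) as the double sum of ai*aj*Cov(Xi, Xj), as in (5.11)."""
    n = len(a)
    return sum(a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n))


a = [3.0, 3.2, 3.4]        # prices per gallon
means = [1000, 500, 300]   # mean gallons sold per grade
sds = [100, 80, 50]        # standard deviations per grade

# Independence makes every off-diagonal covariance 0, reducing (5.11) to (5.9).
cov = [[sds[i] ** 2 if i == j else 0.0 for j in range(3)] for i in range(3)]

mean_y = lin_comb_mean(a, means)
var_y = lin_comb_variance(a, cov)
print(mean_y, var_y, math.sqrt(var_y))
```

Replacing the zeros in `cov` with nonzero covariances would give the variance for dependent purchase amounts without changing the mean.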


The Difference Between Two 
Random Variables 


An important special case of a linear combination results from taking n = 2, a, = 1, 
and a, = —1: 


Y= a,X, + aX, =X, -X, 


We then have the following corollary to the proposition. 


COROLLARY E(X, — X,) = E(X,) — E(X,) for any two rv’s X, and X,. 
V(X, — X,) = V(X) + V(X.) if X, and X, are independent rv’s. 


The expected value of a difference is the difference of the two expected values. 
However, the variance of a difference between two independent variables is the sum, 
not the difference, of the two variances. There is just as much variability in X, — X, 
as in X, + X, [writing X, — X, = X, + (—1)X,, (— DX, has the same amount of 
variability as X, itself]. 

EXAMPLE 5.31 A certain automobile manufacturer equips a particular model with either a six-cylinder
engine or a four-cylinder engine. Let \(X_1\) and \(X_2\) be fuel efficiencies for independently
and randomly selected six-cylinder and four-cylinder cars, respectively. With \(\mu_1 = 22\),
\(\mu_2 = 26\), \(\sigma_1 = 1.2\), and \(\sigma_2 = 1.5\),

\[
\begin{aligned}
E(X_1 - X_2) &= \mu_1 - \mu_2 = 22 - 26 = -4 \\
V(X_1 - X_2) &= \sigma_1^2 + \sigma_2^2 = (1.2)^2 + (1.5)^2 = 3.69 \\
\sigma_{X_1 - X_2} &= \sqrt{3.69} = 1.92
\end{aligned}
\]

If we relabel so that \(X_1\) refers to the four-cylinder car, then \(E(X_1 - X_2) = 4\), but the
variance of the difference is still 3.69.
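
A short simulation can make the "sum, not difference" rule concrete. The sketch below computes the mean and variance of the difference from the corollary and then compares them with simulated values. The example does not specify a distribution, so the use of normal draws here is purely our assumption for illustration, as are the seed and the number of replications.

```python
import math
import random

mu1, mu2 = 22.0, 26.0   # mean fuel efficiencies from Example 5.31
sd1, sd2 = 1.2, 1.5     # standard deviations from Example 5.31

# Values given by the corollary (independence assumed).
mean_diff = mu1 - mu2
var_diff = sd1 ** 2 + sd2 ** 2
sd_diff = math.sqrt(var_diff)

# Simulation check. Normality is an assumption made only for this sketch.
rng = random.Random(0)
diffs = [rng.gauss(mu1, sd1) - rng.gauss(mu2, sd2) for _ in range(100_000)]
sim_mean = sum(diffs) / len(diffs)
sim_var = sum((d - sim_mean) ** 2 for d in diffs) / (len(diffs) - 1)

print(mean_diff, round(var_diff, 2), round(sd_diff, 2))
print(round(sim_mean, 2), round(sim_var, 2))
```

The simulated variance should land near 3.69 rather than near \((1.5)^2 - (1.2)^2 = 0.81\), which is what subtracting the variances would wrongly suggest.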


The Case of Normal Random Variables 


When the \(X_i\)’s form a random sample from a normal distribution, \(\bar{X}\) and \(T_o\) are both
normally distributed. Here is a more general result concerning linear combinations.


PROPOSITION If \(X_1, X_2, \ldots, X_n\) are independent, normally distributed rv’s (with possibly
different means and/or variances), then any linear combination of the \(X_i\)’s also
has a normal distribution. In particular, the difference \(X_1 - X_2\) between two
independent, normally distributed variables is itself normally distributed.


EXAMPLE 5.32 (Example 5.30 continued) The total revenue from the sale of the three grades of
gasoline on a particular day was \(Y = 3.0X_1 + 3.2X_2 + 3.4X_3\), and we calculated
\(\mu_Y = 5620\) and (assuming independence) \(\sigma_Y = 429.46\). If the \(X_i\)’s are normally
distributed, the probability that revenue exceeds 4500 is

\[
P(Y > 4500) = P\left(Z > \frac{4500 - 5620}{429.46}\right) = P(Z > -2.61) = 1 - \Phi(-2.61) = .9955
\]
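
The normal probability in Example 5.32 can be reproduced without a table. The sketch below builds the standard normal cdf from `math.erf`; the function name `std_normal_cdf` is ours. Because it uses the unrounded z value rather than −2.61, the result can differ from the table-based answer in the fourth decimal place.

```python
import math

def std_normal_cdf(z):
    """Standard normal cdf, Phi(z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean_y, sd_y = 5620.0, 429.46      # from Example 5.30
z = (4500 - mean_y) / sd_y         # standardize the revenue threshold
p_exceeds = 1.0 - std_normal_cdf(z)

print(round(z, 2), round(p_exceeds, 4))
```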


The CLT can also be generalized so it applies to certain linear combinations. 
Roughly speaking, if n is large and no individual term is likely to contribute too 
much to the overall value, then Y has approximately a normal distribution. 


Proofs for the Case n = 2 
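For the result concerning expected values, suppose that \(X_1\) and \(X_2\) are continuous
with joint pdf \(f(x_1, x_2)\). Then

\[
\begin{aligned}
E(a_1X_1 + a_2X_2) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (a_1x_1 + a_2x_2)\, f(x_1, x_2)\, dx_1\, dx_2 \\
&= a_1\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 f(x_1, x_2)\, dx_2\, dx_1
 + a_2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_2 f(x_1, x_2)\, dx_1\, dx_2 \\
&= a_1\int_{-\infty}^{\infty} x_1 f_{X_1}(x_1)\, dx_1 + a_2\int_{-\infty}^{\infty} x_2 f_{X_2}(x_2)\, dx_2 \\
&= a_1E(X_1) + a_2E(X_2)
\end{aligned}
\]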

Summation replaces integration in the discrete case. The argument for the variance
result does not require specifying whether either variable is discrete or continuous.
Recalling that \(V(Y) = E[(Y - \mu_Y)^2]\),

\[
\begin{aligned}
V(a_1X_1 + a_2X_2) &= E\{[a_1X_1 + a_2X_2 - (a_1\mu_1 + a_2\mu_2)]^2\} \\
&= E\{a_1^2(X_1 - \mu_1)^2 + a_2^2(X_2 - \mu_2)^2 + 2a_1a_2(X_1 - \mu_1)(X_2 - \mu_2)\}
\end{aligned}
\]

The expression inside the braces is a linear combination of the variables
\(Y_1 = (X_1 - \mu_1)^2\), \(Y_2 = (X_2 - \mu_2)^2\), and \(Y_3 = (X_1 - \mu_1)(X_2 - \mu_2)\), so carrying the \(E\)
operation through to the three terms gives \(a_1^2V(X_1) + a_2^2V(X_2) + 2a_1a_2\,\mathrm{Cov}(X_1, X_2)\)
as required.
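
The variance identity just proved can be verified numerically on a small discrete distribution. In the sketch below, the joint pmf and the coefficients \(a_1 = 2\), \(a_2 = -3\) are hypothetical values chosen only for illustration; the two variables are deliberately dependent so that the covariance term matters.

```python
# Hypothetical joint pmf p(x1, x2) for two dependent 0/1 variables.
pmf = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def expect(g):
    """E[g(X1, X2)] computed by summing over the joint pmf."""
    return sum(g(x1, x2) * p for (x1, x2), p in pmf.items())

a1, a2 = 2.0, -3.0  # hypothetical coefficients

mu1 = expect(lambda x1, x2: x1)
mu2 = expect(lambda x1, x2: x2)
v1 = expect(lambda x1, x2: (x1 - mu1) ** 2)
v2 = expect(lambda x1, x2: (x2 - mu2) ** 2)
cov = expect(lambda x1, x2: (x1 - mu1) * (x2 - mu2))

# Left side: variance of Y = a1*X1 + a2*X2 straight from the definition.
mu_y = a1 * mu1 + a2 * mu2
direct = expect(lambda x1, x2: (a1 * x1 + a2 * x2 - mu_y) ** 2)

# Right side: the formula obtained in the proof.
formula = a1 ** 2 * v1 + a2 ** 2 * v2 + 2 * a1 * a2 * cov

print(round(direct, 6), round(formula, 6))
```

Dropping the covariance term here would give 3.16 instead of 1.96, which illustrates why the simpler rule requires uncorrelated variables.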


EXERCISES Section 5.5 (58-74) 


58. A shipping company handles containers in three different
    sizes: (1) 27 ft³ (3 × 3 × 3), (2) 125 ft³, and (3) 512 ft³.
    Let \(X_i\) (\(i = 1, 2, 3\)) denote the number of type \(i\) containers
    shipped during a given week. With \(\mu_i = E(X_i)\) and
    \(\sigma_i^2 = V(X_i)\), suppose that the mean values and standard
    deviations are as follows:

        \(\mu_1 = 200\)   \(\mu_2 = 250\)   \(\mu_3 = 100\)
        \(\sigma_1 = 10\)    \(\sigma_2 = 12\)    \(\sigma_3 = 8\)

    a. Assuming that \(X_1\), \(X_2\), \(X_3\) are independent, calculate
       the expected value and variance of the total volume
       shipped. [Hint: Volume \(= 27X_1 + 125X_2 + 512X_3\).]

    b. Would your calculations necessarily be correct if the
       \(X_i\)’s were not independent? Explain.


59. Let \(X_1\), \(X_2\), and \(X_3\) represent the times necessary to perform
    three successive repair tasks at a certain service facility. Suppose
    they are independent, normal rv’s with expected values \(\mu_1\), \(\mu_2\),
    and \(\mu_3\) and variances \(\sigma_1^2\), \(\sigma_2^2\), and \(\sigma_3^2\), respectively.

    a. If \(\mu_1 = \mu_2 = \mu_3 = 60\) and \(\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 15\), calculate
       \(P(T_o \le 200)\) and \(P(150 \le T_o \le 200)\).

    b. Using the \(\mu_i\)’s and \(\sigma_i\)’s given in part (a), calculate
       both \(P(55 \le \bar{X})\) and \(P(58 \le \bar{X} \le 62)\).

    c. Using the \(\mu_i\)’s and \(\sigma_i\)’s given in part (a), calculate
       and interpret \(P(-10 \le X_1 - .5X_2 - .5X_3 \le 5)\).

    d. If \(\mu_1 = 40\), \(\mu_2 = 50\), \(\mu_3 = 60\), \(\sigma_1^2 = 10\), \(\sigma_2^2 = 12\),
       and \(\sigma_3^2 = 14\), calculate \(P(X_1 + X_2 + X_3 \le 160)\) and
       also \(P(X_1 + X_2 \ge 2X_3)\).


60. Refer back to Example 5.31. Two cars with six-cylinder
    engines and three with four-cylinder engines are to be driven
    over a 300-mile course. Let \(X_1, \ldots, X_5\) denote the resulting
    fuel efficiencies (mpg). Consider the linear combination

        \(Y = (X_1 + X_2)/2 - (X_3 + X_4 + X_5)/3\)

    which is a measure of the difference between four-cylinder
    and six-cylinder vehicles. Compute \(P(0 \le Y)\)
    and \(P(Y > -2)\). [Hint: \(Y = a_1X_1 + \cdots + a_5X_5\), with
    \(a_1 = 1/2, \ldots, a_5 = -1/3\).]


61. Exercise 26 introduced random variables \(X\) and \(Y\), the
    number of cars and buses, respectively, carried by a
    ferry on a single trip. The joint pmf of \(X\) and \(Y\) is given
    in the table in Exercise 7. It is readily verified that \(X\)
    and \(Y\) are independent.

    a. Compute the expected value, variance, and standard
       deviation of the total number of vehicles on a single
       trip.

    b. If each car is charged $3 and each bus $10, compute
       the expected value, variance, and standard deviation
       of the revenue resulting from a single trip.


62. Manufacture of a certain component requires three
    different machining operations. Machining time for
    each operation has a normal distribution, and the three
    times are independent of one another. The mean values
    are 15, 30, and 20 min, respectively, and the standard
    deviations are 1, 2, and 1.5 min, respectively. What is
    the probability that it takes at most 1 hour of machining
    time to produce a randomly selected component?


63. Refer to Exercise 3.

    a. Calculate the covariance between \(X_1\) = the number
       of customers in the express checkout and \(X_2\) = the
       number of customers in the superexpress checkout.

    b. Calculate \(V(X_1 + X_2)\). How does this compare to
       \(V(X_1) + V(X_2)\)?


64. Suppose your waiting time for a bus in the morning is
    uniformly distributed on [0, 8], whereas waiting time in
    the evening is uniformly distributed on [0, 10] independent
    of morning waiting time.

    a. If you take the bus each morning and evening for a
       week, what is your total expected waiting time?
       [Hint: Define rv’s \(X_1, \ldots, X_{10}\) and use a rule of
       expected value.]

    b. What is the variance of your total waiting time?

    c. What are the expected value and variance of the
       difference between morning and evening waiting times
       on a given day?

    d. What are the expected value and variance of the
       difference between total morning waiting time and total
       evening waiting time for a particular week?


65. Suppose that when the pH of a certain chemical
    compound is 5.00, the pH measured by a randomly
    selected beginning chemistry student is a random variable
    with mean 5.00 and standard deviation .2. A large
    batch of the compound is subdivided and a sample
    given to each student in a morning lab and each student
    in an afternoon lab. Let \(\bar{X}\) = the average pH as determined
    by the morning students and \(\bar{Y}\) = the average pH
    as determined by the afternoon students.

    a. If pH is a normal variable and there are 25 students
       in each lab, compute \(P(-.1 \le \bar{X} - \bar{Y} \le .1)\). [Hint:
       \(\bar{X} - \bar{Y}\) is a linear combination of normal variables,
       so is normally distributed. Compute \(\mu_{\bar{X} - \bar{Y}}\) and \(\sigma_{\bar{X} - \bar{Y}}\).]

    b. If there are 36 students in each lab, but pH determinations
       are not assumed normal, calculate (approximately)
       \(P(-.1 \le \bar{X} - \bar{Y} \le .1)\).


66. If two loads are applied to a cantilever beam as shown in
    the accompanying drawing, the bending moment at 0 due
    to the loads is \(a_1X_1 + a_2X_2\).

    [Figure: cantilever beam with the two applied loads; drawing not reproduced]

    a. Suppose that \(X_1\) and \(X_2\) are independent rv’s with
       means 2 and 4 kips, respectively, and standard deviations
       .5 and 1.0 kip, respectively. If \(a_1 = 5\) ft and
       \(a_2 = 10\) ft, what is the expected bending moment and
       what is the standard deviation of the bending
       moment?

    b. If \(X_1\) and \(X_2\) are normally distributed, what is the
       probability that the bending moment will exceed
       75 kip-ft?

    c. Suppose the positions of the two loads are random
       variables. Denoting them by \(A_1\) and \(A_2\), assume that
       these variables have means of 5 and 10 ft, respectively,
       that each has a standard deviation of .5, and
       that all \(A_i\)’s and \(X_i\)’s are independent of one another.
       What is the expected moment now?

    d. For the situation of part (c), what is the variance of
       the bending moment?

    e. If the situation is as described in part (a) except that
       \(\mathrm{Corr}(X_1, X_2) = .5\) (so that the two loads are not
       independent), what is the variance of the bending
       moment?


67. One piece of PVC pipe is to be inserted inside another
    piece. The length of the first piece is normally distributed
    with mean value 20 in. and standard deviation .5 in. The
    length of the second piece is a normal rv with mean and
    standard deviation 15 in. and .4 in., respectively. The
    amount of overlap is normally distributed with mean
    value 1 in. and standard deviation .1 in. Assuming that
    the lengths and amount of overlap are independent of one
    another, what is the probability that the total length after
    insertion is between 34.5 in. and 35 in.?


68. Two airplanes are flying in the same direction in adjacent
    parallel corridors. At time \(t = 0\), the first airplane is
    10 km ahead of the second one. Suppose the speed of the
    first plane (km/hr) is normally distributed with mean 520
    and standard deviation 10 and the second plane’s speed
    is also normally distributed with mean and standard
    deviation 500 and 10, respectively.

    a. What is the probability that after 2 hr of flying, the
       second plane has not caught up to the first plane?

    b. Determine the probability that the planes are separated
       by at most 10 km after 2 hr.


69. Three different roads feed into a particular freeway
    entrance. Suppose that during a fixed time period, the
    number of cars coming from each road onto the freeway
    is a random variable, with expected value and standard
    deviation as given in the table.

                            Road 1    Road 2    Road 3
        Expected value        800      1000       600
        Standard deviation     16        25        18

    a. What is the expected total number of cars entering
       the freeway at this point during the period? [Hint:
       Let \(X_i\) = the number from road \(i\).]

    b. What is the variance of the total number of entering
       cars? Have you made any assumptions about the
       relationship between the numbers of cars on the
       different roads?

    c. With \(X_i\) denoting the number of cars entering from
       road \(i\) during the period, suppose that \(\mathrm{Cov}(X_1, X_2) = 80\),
       \(\mathrm{Cov}(X_1, X_3) = 90\), and \(\mathrm{Cov}(X_2, X_3) = 100\) (so that the
       three streams of traffic are not independent). Compute
       the expected total number of entering cars and the
       standard deviation of the total.


70. Consider a random sample of size \(n\) from a continuous
    distribution having median 0 so that the probability of
    any one observation being positive is .5. Disregarding the
    signs of the observations, rank them from smallest to
    largest in absolute value, and let \(W\) = the sum of the
    ranks of the observations having positive signs. For
    example, if the observations are −.3, +.7, +2.1, and
    −2.5, then the ranks of positive observations are 2 and 3,
    so \(W = 5\). In Chapter 15, \(W\) will be called Wilcoxon’s
    signed-rank statistic. \(W\) can be represented as follows:

    \[
    W = 1 \cdot Y_1 + 2 \cdot Y_2 + 3 \cdot Y_3 + \cdots + n \cdot Y_n = \sum_{i=1}^{n} i \cdot Y_i
    \]

    where the \(Y_i\)’s are independent Bernoulli rv’s, each with
    \(p = .5\) (\(Y_i = 1\) corresponds to the observation with rank
    \(i\) being positive).

    a. Determine \(E(Y_i)\) and then \(E(W)\) using the equation
       for \(W\). [Hint: The first \(n\) positive integers sum to
       \(n(n + 1)/2\).]

    b. Determine \(V(Y_i)\) and then \(V(W)\). [Hint: The sum of
       the squares of the first \(n\) positive integers can be
       expressed as \(n(n + 1)(2n + 1)/6\).]


71. In Exercise 66, the weight of the beam itself contributes
    to the bending moment. Assume that the beam is of uniform
    thickness and density so that the resulting load is
    uniformly distributed on the beam. If the weight of the
    beam is random, the resulting load from the weight is
    also random; denote this load by \(W\) (kip-ft).

    a. If the beam is 12 ft long, \(W\) has mean 1.5 and standard
       deviation .25, and the fixed loads are as described
       in part (a) of Exercise 66, what are the expected value
       and variance of the bending moment? [Hint: If the
       load due to the beam were \(w\) kip-ft, the contribution
       to the bending moment would be \(w\int_0^{12} x\,dx\).]

    b. If all three variables (\(X_1\), \(X_2\), and \(W\)) are normally
       distributed, what is the probability that the bending
       moment will be at most 200 kip-ft?


72. I have three errands to take care of in the Administration
    Building. Let \(X_i\) = the time that it takes for the \(i\)th errand
    (\(i = 1, 2, 3\)), and let \(X_4\) = the total time in minutes that
    I spend walking to and from the building and between
    each errand. Suppose the \(X_i\)’s are independent, and normally
    distributed, with the following means and standard
    deviations: \(\mu_1 = 15\), \(\sigma_1 = 4\), \(\mu_2 = 5\), \(\sigma_2 = 1\), \(\mu_3 = 8\),
    \(\sigma_3 = 2\), \(\mu_4 = 12\), \(\sigma_4 = 3\). I plan to leave my office at
    precisely 10:00 A.M. and wish to post a note on my door
    that reads, “I will return by \(t\) A.M.” What time \(t\) should I
    write down if I want the probability of my arriving after
    \(t\) to be .01?


73. Suppose the expected tensile strength of type-A steel is
    105 ksi and the standard deviation of tensile strength is
    8 ksi. For type-B steel, suppose the expected tensile
    strength and standard deviation of tensile strength are
    100 ksi and 6 ksi, respectively. Let \(\bar{X}\) = the sample average
    tensile strength of a random sample of 40 type-A
    specimens, and let \(\bar{Y}\) = the sample average tensile
    strength of a random sample of 35 type-B specimens.

    a. What is the approximate distribution of \(\bar{X}\)? Of \(\bar{Y}\)?

    b. What is the approximate distribution of \(\bar{X} - \bar{Y}\)?
       Justify your answer.

    c. Calculate (approximately) \(P(-1 \le \bar{X} - \bar{Y} \le 1)\).

    d. Calculate \(P(\bar{X} - \bar{Y} \ge 10)\). If you actually observed
       \(\bar{X} - \bar{Y} \ge 10\), would you doubt that \(\mu_1 - \mu_2 = 5\)?


74. In an area having sandy soil, 50 small trees of a certain
    type were planted, and another 50 trees were planted in an
    area having clay soil. Let \(X\) = the number of trees planted
    in sandy soil that survive 1 year and \(Y\) = the number of
    trees planted in clay soil that survive 1 year. If the probability
    that a tree planted in sandy soil will survive 1 year
    is .7 and the probability of 1-year survival in clay soil is
    .6, compute an approximation to \(P(-5 \le X - Y \le 5)\) (do
    not bother with the continuity correction).


SUPPLEMENTARY EXERCISES (75-96) 


75. A restaurant serves three fixed-price dinners costing $12,
    $15, and $20. For a randomly selected couple dining at
    this restaurant, let \(X\) = the cost of the man’s dinner and
    \(Y\) = the cost of the woman’s dinner. The joint pmf of \(X\)
    and \(Y\) is given in the following table:

                            y
        p(x, y)       12     15     20
              12     .05    .05    .10
        x     15     .05    .10    .35
              20      0     .20    .10

    a. Compute the marginal pmf’s of \(X\) and \(Y\).

    b. What is the probability that the man’s and the
       woman’s dinner cost at most $15 each?

    c. Are \(X\) and \(Y\) independent? Justify your answer.

    d. What is the expected total cost of the dinner for the
       two people?

    e. Suppose that when a couple opens fortune cookies at
       the conclusion of the meal, they find the message “You
       will receive as a refund the difference between the cost
       of the more expensive and the less expensive meal that
       you have chosen.” How much would the restaurant
       expect to refund?


76. In cost estimation, the total cost of a project is the sum
    of component task costs. Each of these costs is a random
    variable with a probability distribution. It is customary
    to obtain information about the total cost distribution
    by adding together characteristics of the
    individual component cost distributions; this is called
    the “roll-up” procedure. For example,
    \(E(X_1 + \cdots + X_n) = E(X_1) + \cdots + E(X_n)\), so the roll-up procedure is
    valid for mean cost. Suppose that there are two component
    tasks and that \(X_1\) and \(X_2\) are independent, normally
    distributed random variables. Is the roll-up procedure
    valid for the 75th percentile? That is, is the 75th
    percentile of the distribution of \(X_1 + X_2\) the same as
    the sum of the 75th percentiles of the two individual
    distributions? If not, what is the relationship between
    the percentile of the sum and the sum of percentiles?
    For what percentiles is the roll-up procedure valid in
    this case?

77. A health-food store stocks two different brands of a certain
    type of grain. Let \(X\) = the amount (lb) of brand A on
    hand and \(Y\) = the amount of brand B on hand. Suppose
    the joint pdf of \(X\) and \(Y\) is

    \[
    f(x, y) = \begin{cases} kxy & x \ge 0,\ y \ge 0,\ 20 \le x + y \le 30 \\ 0 & \text{otherwise} \end{cases}
    \]

    a. Draw the region of positive density and determine
       the value of \(k\).

    b. Are \(X\) and \(Y\) independent? Answer by first deriving
       the marginal pdf of each variable.

    c. Compute \(P(X + Y \le 25)\).

    d. What is the expected total amount of this grain on
       hand?

    e. Compute \(\mathrm{Cov}(X, Y)\) and \(\mathrm{Corr}(X, Y)\).

    f. What is the variance of the total amount of grain on
       hand?


78. According to the article “Reliability Evaluation of
    Hard Disk Drive Failures Based on Counting
    Processes” (Reliability Engr. and System Safety, 2013:
    110-118), particles accumulating on a disk drive come
    from two sources, one external and the other internal.
    The article proposed a model in which the internal
    source contains a number of loose particles \(W\) having a
    Poisson distribution with mean value \(\mu\); when a loose
    particle releases, it immediately enters the drive, and the
    release times are independent and identically distributed
    with cumulative distribution function \(G(t)\). Let \(X\) denote
    the number of loose particles not yet released at a particular
    time \(t\). Show that \(X\) has a Poisson distribution with
    parameter \(\mu[1 - G(t)]\). [Hint: Let \(Y\) denote the number of
    particles accumulated on the drive from the internal
    source by time \(t\) so that \(X + Y = W\). Obtain an expression
    for \(P(X = x, Y = y)\) and then sum over \(y\).]


79. Suppose that for a certain individual, calorie intake at
    breakfast is a random variable with expected value
    500 and standard deviation 50, calorie intake at lunch
    is random with expected value 900 and standard
    deviation 100, and calorie intake at dinner is a random
    variable with expected value 2000 and standard deviation
    180. Assuming that intakes at different meals are
    independent of one another, what is the probability
    that average calorie intake per day over the next (365-day)
    year is at most 3500? [Hint: Let \(X_i\), \(Y_i\), and \(Z_i\)
    denote the three calorie intakes on day \(i\). Then total
    intake is given by \(\sum (X_i + Y_i + Z_i)\).]


80. The mean weight of luggage checked by a randomly
    selected tourist-class passenger flying between two cities
    on a certain airline is 40 lb, and the standard deviation is
    10 lb. The mean and standard deviation for a business-class
    passenger are 30 lb and 6 lb, respectively.

    a. If there are 12 business-class passengers and 50
       tourist-class passengers on a particular flight, what
       are the expected value of total luggage weight and
       the standard deviation of total luggage weight?

    b. If individual luggage weights are independent, normally
       distributed rv’s, what is the probability that
       total luggage weight is at most 2500 lb?


81. We have seen that if \(E(X_1) = E(X_2) = \cdots = E(X_n) = \mu\),
    then \(E(X_1 + \cdots + X_n) = n\mu\). In some applications,
    the number of \(X_i\)’s under consideration is not a fixed
    number \(n\) but instead is an rv \(N\). For example, let
    \(N\) = the number of components that are brought into
    a repair shop on a particular day, and let \(X_i\) denote the
    repair shop time for the \(i\)th component. Then the total
    repair time is \(X_1 + X_2 + \cdots + X_N\), the sum of a random
    number of random variables. When \(N\) is independent
    of the \(X_i\)’s, it can be shown that

    \[ E(X_1 + \cdots + X_N) = E(N) \cdot \mu \]

    a. If the expected number of components brought in on
       a particular day is 10 and expected repair time for
       a randomly submitted component is 40 min, what is
       the expected total repair time for components submitted
       on any particular day?

    b. Suppose components of a certain type come in for
       repair according to a Poisson process with a rate of 5
       per hour. The expected number of defects per component
       is 3.5. What is the expected value of the total
       number of defects on components submitted for repair
       during a 4-hour period? Be sure to indicate how your
       answer follows from the general result just given.


82. Suppose the proportion of rural voters in a certain state
    who favor a particular gubernatorial candidate is .45
    and the proportion of suburban and urban voters favoring
    the candidate is .60. If a sample of 200 rural voters
    and 300 urban and suburban voters is obtained, what is
    the approximate probability that at least 250 of these
    voters favor this candidate?


83. Let \(\mu\) denote the true pH of a chemical compound. A
    sequence of \(n\) independent sample pH determinations
    will be made. Suppose each sample pH is a random variable
    with expected value \(\mu\) and standard deviation .1.
    How many determinations are required if we wish the
    probability that the sample average is within .02 of the
    true pH to be at least .95? What theorem justifies your
    probability calculation?


84. If the amount of soft drink that I consume on any given
    day is independent of consumption on any other day and
    is normally distributed with \(\mu = 13\) oz and \(\sigma = 2\) and if
    I currently have two six-packs of 16-oz bottles, what is
    the probability that I still have some soft drink left at the
    end of 2 weeks (14 days)?


85. Refer to Exercise 58, and suppose that the \(X_i\)’s are independent
    with each one having a normal distribution.
    What is the probability that the total volume shipped is
    at most 100,000 ft³?


86. A student has a class that is supposed to end at 9:00 A.M.
    and another that is supposed to begin at 9:10 A.M.
    Suppose the actual ending time of the 9 A.M. class is a
    normally distributed rv \(X_1\) with mean 9:02 and standard
    deviation 1.5 min and that the starting time of the next
    class is also a normally distributed rv \(X_2\) with mean 9:10
    and standard deviation 1 min. Suppose also that the time
    necessary to get from one classroom to the other is a
    normally distributed rv \(X_3\) with mean 6 min and standard
    deviation 1 min. What is the probability that the student
    makes it to the second class before the lecture starts?
    (Assume independence of \(X_1\), \(X_2\), and \(X_3\), which is reasonable
    if the student pays no attention to the finishing time
    of the first class.)


87. Garbage trucks entering a particular waste-management
    facility are weighed prior to offloading their contents.
    Let \(X\) = the total processing time for a randomly selected
    truck at this facility (waiting, weighing, and offloading).
    The article “Estimating Waste Transfer Station Delays
    Using GPS” (Waste Mgmt., 2008: 1742-1750) suggests
    the plausibility of a normal distribution with mean
    13 min and standard deviation 4 min for \(X\). Assume that
    this is in fact the correct distribution.

    a. What is the probability that a single truck’s processing
       time is between 12 and 15 min?

    b. Consider a random sample of 16 trucks. What is the
       probability that the sample mean processing time is
       between 12 and 15 min?

    c. Why is the probability in (b) much larger than the
       probability in (a)?

    d. What is the probability that the sample mean processing
       time for a random sample of 16 trucks will
       be at least 20 min?


88. Each customer making a particular Internet purchase
    must pay with one of three types of credit cards (think
    Visa, MasterCard, AmEx). Let \(A_i\) (\(i = 1, 2, 3\)) be the
    event that a type \(i\) credit card is used, with \(P(A_1) = .5\),
    \(P(A_2) = .3\), and \(P(A_3) = .2\). Suppose that the number of
    customers who make such a purchase on a given day is
    a Poisson rv with parameter \(\lambda\). Define rv’s \(X_1\), \(X_2\), \(X_3\) by
    \(X_i\) = the number among the \(N\) customers who use a type
    \(i\) card (\(i = 1, 2, 3\)). Show that these three rv’s are independent
    with Poisson distributions having parameters
    \(.5\lambda\), \(.3\lambda\), and \(.2\lambda\), respectively. [Hint: For non-negative
    integers \(x_1, x_2, x_3\), let \(n = x_1 + x_2 + x_3\). Then
    \(P(X_1 = x_1, X_2 = x_2, X_3 = x_3) = P(X_1 = x_1, X_2 = x_2, X_3 = x_3, N = n)\)
    [why is this?]. Now condition on \(N = n\), in which case
    the three \(X_i\)’s have a trinomial distribution (multinomial
    with three categories) with category probabilities .5, .3,
    and .2.]


89. a. Use the general formula for the variance of a linear
       combination to write an expression for \(V(aX + Y)\).
       Then let \(a = \sigma_Y/\sigma_X\), and show that \(\rho \ge -1\). [Hint:
       Variance is always \(\ge 0\), and \(\mathrm{Cov}(X, Y) = \sigma_X \cdot \sigma_Y \cdot \rho\).]

    b. By considering \(V(aX - Y)\), conclude that \(\rho \le 1\).

    c. Use the fact that \(V(W) = 0\) only if \(W\) is a constant to
       show that \(\rho = 1\) only if \(Y = aX + b\).

90. Suppose a randomly chosen individual’s verbal score \(X\)
    and quantitative score \(Y\) on a nationally administered
    aptitude examination have a joint pdf

    \[
    f(x, y) = \begin{cases} \dfrac{2}{5}(2x + 3y) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}
    \]

    You are asked to provide a prediction \(t\) of the individual’s
    total score \(X + Y\). The error of prediction is the mean
    squared error \(E[(X + Y - t)^2]\). What value of \(t\) minimizes
    the error of prediction?


91. a. Let \(X_1\) have a chi-squared distribution with parameter
       \(\nu_1\) (see Section 4.4), and let \(X_2\) be independent of
       \(X_1\) and have a chi-squared distribution with parameter
       \(\nu_2\). Use the technique of Example 5.22 to show
       that \(X_1 + X_2\) has a chi-squared distribution with
       parameter \(\nu_1 + \nu_2\).

    b. In Exercise 71 of Chapter 4, you were asked to show
       that if \(Z\) is a standard normal rv, then \(Z^2\) has a
       chi-squared distribution with \(\nu = 1\). Let \(Z_1, Z_2, \ldots, Z_n\)
       be \(n\) independent standard normal rv’s. What is the
       distribution of \(Z_1^2 + \cdots + Z_n^2\)? Justify your answer.

    c. Let \(X_1, \ldots, X_n\) be a random sample from a normal
       distribution with mean \(\mu\) and variance \(\sigma^2\). What is
       the distribution of the sum \(Y = \sum_{i=1}^{n} [(X_i - \mu)/\sigma]^2\)?
       Justify your answer.


92. a. Show that \(\mathrm{Cov}(X, Y + Z) = \mathrm{Cov}(X, Y) + \mathrm{Cov}(X, Z)\).

    b. Let \(X_1\) and \(X_2\) be quantitative and verbal scores on
       one aptitude exam, and let \(Y_1\) and \(Y_2\) be corresponding
       scores on another exam. If \(\mathrm{Cov}(X_1, Y_1) = 5\),
       \(\mathrm{Cov}(X_1, Y_2) = 1\), \(\mathrm{Cov}(X_2, Y_1) = 2\), and \(\mathrm{Cov}(X_2, Y_2) = 8\),
       what is the covariance between the two total scores
       \(X_1 + X_2\) and \(Y_1 + Y_2\)?


93. A rock specimen from a particular area is randomly
    selected and weighed two different times. Let \(W\)
    denote the actual weight and \(X_1\) and \(X_2\) the two measured
    weights. Then \(X_1 = W + E_1\) and \(X_2 = W + E_2\),
    where \(E_1\) and \(E_2\) are the two measurement errors.
    Suppose that the \(E_i\)’s are independent of one another
    and of \(W\) and that \(V(E_1) = V(E_2) = \sigma_E^2\).

    a. Express \(\rho\), the correlation coefficient between the
       two measured weights \(X_1\) and \(X_2\), in terms of \(\sigma_W^2\), the
       variance of actual weight, and \(\sigma_X^2\), the variance of
       measured weight.

    b. Compute \(\rho\) when \(\sigma_W = 1\) kg and \(\sigma_E = .01\) kg.


94. Let \(A\) denote the percentage of one constituent in a randomly
    selected rock specimen, and let \(B\) denote the percentage
    of a second constituent in that same specimen.
    Suppose \(D\) and \(E\) are measurement errors in determining
    the values of \(A\) and \(B\) so that measured values are
    \(X = A + D\) and \(Y = B + E\), respectively. Assume that
    measurement errors are independent of one another and
    of actual values.

    a. Show that

       \[ \mathrm{Corr}(X, Y) = \mathrm{Corr}(A, B) \cdot \sqrt{\mathrm{Corr}(X_1, X_2)} \cdot \sqrt{\mathrm{Corr}(Y_1, Y_2)} \]

       where \(X_1\) and \(X_2\) are replicate measurements on the
       value of \(A\), and \(Y_1\) and \(Y_2\) are defined analogously
       with respect to \(B\). What effect does the presence of
       measurement error have on the correlation?

    b. What is the maximum value of \(\mathrm{Corr}(X, Y)\) when
       \(\mathrm{Corr}(X_1, X_2) = .8100\) and \(\mathrm{Corr}(Y_1, Y_2) = .9025\)? Is
       this disturbing?


95. Let \(X_1, \ldots, X_n\) be independent rv’s with mean values
    \(\mu_1, \ldots, \mu_n\) and variances \(\sigma_1^2, \ldots, \sigma_n^2\). Consider a function
    \(h(x_1, \ldots, x_n)\), and use it to define a new rv \(Y = h(X_1, \ldots, X_n)\).
    Under rather general conditions on the \(h\) function, if the
    \(\sigma_i\)’s are all small relative to the corresponding \(\mu_i\)’s, it can
    be shown that \(E(Y) \approx h(\mu_1, \ldots, \mu_n)\) and

    \[
    V(Y) \approx \left(\frac{\partial h}{\partial x_1}\right)^2 \sigma_1^2 + \cdots + \left(\frac{\partial h}{\partial x_n}\right)^2 \sigma_n^2
    \]

    where each partial derivative is evaluated at
    \((x_1, \ldots, x_n) = (\mu_1, \ldots, \mu_n)\). Suppose three resistors with resistances \(X_1\), \(X_2\),
    \(X_3\) are connected in parallel across a battery with voltage \(X_4\).
    Then by Ohm’s law, the current is

    \[ Y = X_4\left[\frac{1}{X_1} + \frac{1}{X_2} + \frac{1}{X_3}\right] \]

    Let \(\mu_1 = 10\) ohms, \(\sigma_1 = 1.0\) ohm, \(\mu_2 = 15\) ohms,
    \(\sigma_2 = 1.0\) ohm, \(\mu_3 = 20\) ohms, \(\sigma_3 = 1.5\) ohms, \(\mu_4 = 120\) V,
    \(\sigma_4 = 4.0\) V. Calculate the approximate expected value and
    standard deviation of the current (suggested by “Random
    Samplings,” CHEMTECH, 1984: 696-697).


96. A more accurate approximation to \(E[h(X_1, \ldots, X_n)]\) in
    Exercise 95 is

    \[
    h(\mu_1, \ldots, \mu_n) + \frac{1}{2}\sigma_1^2\left(\frac{\partial^2 h}{\partial x_1^2}\right) + \cdots + \frac{1}{2}\sigma_n^2\left(\frac{\partial^2 h}{\partial x_n^2}\right)
    \]

    Compute this for \(Y = h(X_1, X_2, X_3, X_4)\) given in Exercise 95,
    and compare it to the leading term \(h(\mu_1, \ldots, \mu_n)\).


97. Let \(X\) and \(Y\) be independent standard normal random
    variables, and define a new rv by \(U = .6X + .8Y\).

    a. Determine \(\mathrm{Corr}(X, U)\).

    b. How would you alter \(U\) to obtain \(\mathrm{Corr}(X, U) = \rho\) for
       a specified value of \(\rho\)?


98. Let \(X_1, X_2, \ldots, X_n\) be random variables denoting \(n\) independent
    bids for an item that is for sale. Suppose each \(X_i\)
    is uniformly distributed on the interval [100, 200]. If the
    seller sells to the highest bidder, how much can he expect
    to earn on the sale? [Hint: Let \(Y = \max(X_1, X_2, \ldots, X_n)\).
    First find \(F_Y(y)\) by noting that \(Y \le y\) iff each \(X_i\) is \(\le y\).
    Then obtain the pdf and \(E(Y)\).]

Point Estimation


INTRODUCTION


Given a parameter of interest, such as a population mean \(\mu\) or population proportion
\(p\), the objective of point estimation is to use a sample to compute a
number that represents in some sense an educated guess for the true value
of the parameter. The resulting number is called a point estimate. Section 6.1
introduces some general concepts of point estimation. In Section 6.2, we
describe and illustrate two important methods for obtaining point estimates:
the method of moments and the method of maximum likelihood.


6.1 Some General Concepts of Point Estimation


Statistical inference is almost always directed toward drawing some type of conclusion
about one or more parameters (population characteristics). To do so requires
that an investigator obtain sample data from each of the populations under study.
Conclusions can then be based on the computed values of various sample quantities.
For example, let \(\mu\) (a parameter) denote the true average breaking strength of wire
connections used in bonding semiconductor wafers. A random sample of \(n = 10\)
connections might be made, and the breaking strength of each one determined,
resulting in observed strengths \(x_1, x_2, \ldots, x_{10}\). The sample mean breaking strength \(\bar{x}\)
could then be used to draw a conclusion about the value of \(\mu\). Similarly, if \(\sigma^2\) is the
variance of the breaking strength distribution (population variance, another parameter),
the value of the sample variance \(s^2\) can be used to infer something about \(\sigma^2\).

When discussing general concepts and methods of inference, it is convenient
to have a generic symbol for the parameter of interest. We will use the Greek
letter \(\theta\) for this purpose. In many investigations, \(\theta\) will be a population mean \(\mu\), a
difference \(\mu_1 - \mu_2\) between two population means, or a population proportion of
“successes” \(p\). The objective of point estimation is to select a single number, based
on sample data, that represents a sensible value for \(\theta\). As an example, the parameter
of interest might be \(\mu\), the true average lifetime of batteries of a certain type. A
random sample of \(n = 3\) batteries might yield observed lifetimes (hours) \(x_1 = 5.0\),
\(x_2 = 6.4\), \(x_3 = 5.9\). The computed value of the sample mean lifetime is \(\bar{x} = 5.77\), and
it is reasonable to regard 5.77 as a very plausible value of \(\mu\), our “best guess” for
the value of \(\mu\) based on the available sample information.

Suppose we want to estimate a parameter of a single population (e.g., \(\mu\) or \(\sigma\))
based on a random sample of size \(n\). Recall from the previous chapter that before data
is available, the sample observations must be considered random variables (rv’s)
\(X_1, X_2, \ldots, X_n\). It follows that any function of the \(X_i\)’s (that is, any statistic) such as the
sample mean \(\bar{X}\) or sample standard deviation \(S\) is also a random variable. The same
is true if available data consists of more than one sample. For example, we can represent
tensile strengths of \(m\) type 1 specimens and \(n\) type 2 specimens by \(X_1, \ldots, X_m\)
and \(Y_1, \ldots, Y_n\), respectively. The difference between the two sample mean strengths is
\(\bar{X} - \bar{Y}\); this is the natural statistic for making inferences about \(\mu_1 - \mu_2\), the difference
between the population mean strengths.


DEFINITION A point estimate of a parameter $\theta$ is a single number that can be regarded as
a sensible value for $\theta$. It is obtained by selecting a suitable statistic and
computing its value from the given sample data. The selected statistic is called the
point estimator of $\theta$.


In the foregoing battery example, the estimator used to obtain the point estimate
of $\mu$ was $\bar{X}$, and the point estimate of $\mu$ was 5.77. If the three observed lifetimes had
instead been $x_1 = 5.6$, $x_2 = 4.5$, and $x_3 = 6.1$, use of the estimator $\bar{X}$ would have
resulted in the estimate $\bar{x} = (5.6 + 4.5 + 6.1)/3 = 5.40$. The symbol $\hat{\theta}$ (“theta hat”)
is customarily used to denote both the estimator of $\theta$ and the point estimate resulting
from a given sample.* Thus $\hat{\mu} = \bar{X}$ is read as “the point estimator of $\mu$ is the sample
mean $\bar{X}$.” The statement “the point estimate of $\mu$ is 5.77” can be written concisely
as $\hat{\mu} = 5.77$. Notice that in writing $\hat{\theta} = 72.5$, there is no indication of how this point
estimate was obtained (what statistic was used). It is recommended that both the
estimator and the resulting estimate be reported.

* Following earlier notation, we could use $\hat{\Theta}$ (an uppercase theta) for the estimator, but this is
cumbersome to write.


EXAMPLE 6.1 An automobile manufacturer has developed a new type of bumper, which is sup- 
posed to absorb impacts with less damage than previous bumpers. The manufacturer 
has used this bumper in a sequence of 25 controlled crashes against a wall, each at 
10 mph, using one of its compact car models. Let X = the number of crashes that result 
in no visible damage to the automobile. The parameter to be estimated is p = the pro- 
portion of all such crashes that result in no damage [alternatively, p = P(no damage 
in a single crash)]. If X is observed to be x = 15, the most reasonable estimator and 
estimate are 


\[ \text{estimator } \hat{p} = \frac{X}{n}, \qquad \text{estimate} = \frac{x}{n} = \frac{15}{25} = .60 \]

If for each parameter of interest there were only one reasonable point estimator,
there would not be much to point estimation. In most problems, though, there
will be more than one reasonable estimator.


EXAMPLE 6.2 Consider the accompanying 20 observations on dielectric breakdown voltage for 
pieces of epoxy resin first introduced in Exercise 4.89. 


24.46  25.61  26.25  26.42  26.66  27.15  27.31  27.54  27.74  27.94
27.98  28.04  28.28  28.49  28.50  28.87  29.11  29.13  29.50  30.88


The pattern in the normal probability plot given there is quite straight, so we now
assume that the distribution of breakdown voltage is normal with mean value $\mu$.
Because normal distributions are symmetric, $\mu$ is also the median of the
distribution. The given observations are then assumed to be the result of a random
sample $X_1, X_2, \ldots, X_{20}$ from this normal distribution. Consider the following
estimators and resulting estimates for $\mu$:


a. Estimator = $\bar{X}$, estimate = $\bar{x} = \sum x_i/n = 555.86/20 = 27.793$

b. Estimator = $\tilde{X}$ (the sample median), estimate = $\tilde{x} = (27.94 + 27.98)/2 = 27.960$

c. Estimator = $[\min(X_i) + \max(X_i)]/2$ = the average of the two extreme observations,
estimate = $[\min(x_i) + \max(x_i)]/2 = (24.46 + 30.88)/2 = 27.670$

d. Estimator = $\bar{X}_{tr(10)}$, the 10% trimmed mean (discard the smallest and largest
10% of the sample and then average),
estimate = $\bar{x}_{tr(10)} = (555.86 - 24.46 - 25.61 - 29.50 - 30.88)/16 = 27.838$
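The four estimates in (a)-(d) can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper names `midrange` and `trimmed_mean` are our own, not part of any library, and the trimming helper assumes that the trimming proportion times n is a whole number (as it is here, 0.10 × 20 = 2).

```python
from statistics import mean, median

# Breakdown-voltage observations from Example 6.2 (n = 20).
voltages = [
    24.46, 25.61, 26.25, 26.42, 26.66, 27.15, 27.31, 27.54, 27.74, 27.94,
    27.98, 28.04, 28.28, 28.49, 28.50, 28.87, 29.11, 29.13, 29.50, 30.88,
]

def midrange(data):
    """Average of the smallest and largest observations (estimator c)."""
    return (min(data) + max(data)) / 2

def trimmed_mean(data, proportion):
    """Discard `proportion` of the sample from each end, then average the rest.

    Hypothetical helper; assumes proportion * len(data) is a whole number.
    """
    ordered = sorted(data)
    k = round(proportion * len(ordered))  # observations dropped from each end
    return mean(ordered[k:len(ordered) - k])

estimates = {
    "mean": mean(voltages),
    "median": median(voltages),
    "midrange": midrange(voltages),
    "trimmed_mean_10": trimmed_mean(voltages, 0.10),
}

for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

All four functions are applied to the same sample, which makes the point of the example concrete: the data are fixed, and only the choice of estimator changes the reported value.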


Each one of the estimators (a)-(d) uses a different measure of the center of the sample
to estimate $\mu$. Which of the estimates is closest to the true value? This question
cannot be answered without knowing the true value. A question that can be answered is,
“Which estimator, when used on other samples of $X_i$'s, will tend to produce estimates
closest to the true value?” We will shortly address this issue.

EXAMPLE 6.3 The article “Is a Normal Distribution the Most Appropriate Statistical
Distribution for Volumetric Properties in Asphalt Mixtures?” first cited in
Example 4.26, reported the following observations on X = voids filled with asphalt
(%) for 52 specimens of a certain type of hot-mix asphalt:


74.33  71.07  73.82  77.42  79.35  82.27  77.75  78.65  77.19
74.69  77.25  74.84  60.90  60.75  74.09  65.36  67.84  69.97
68.83  75.09  62.54  67.47  72.00  66.51  68.21  64.46  64.34
64.93  67.33  66.08  67.31  74.87  69.40  70.83  81.73  82.50
79.87  81.96  79.51  84.12  80.61  79.89  79.70  78.74  77.28
79.97  75.09  74.38  77.67  83.73  80.39  76.90


Let's estimate the variance $\sigma^2$ of the population distribution. A natural estimator is
the sample variance:

\[ \hat{\sigma}^2 = S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \]

Minitab gave the following output from a request to display descriptive statistics:


Variable  Count  Mean    SE Mean  StDev  Variance  Q1      Median  Q3
VFA(B)    52     73.880  0.889    6.413  41.126    67.933  74.855  79.470


Thus the point estimate of the population variance is

\[ \hat{\sigma}^2 = s^2 = \frac{\sum (x_i - \bar{x})^2}{52 - 1} = 41.126 \]

[alternatively, the computational formula for the numerator of $s^2$ gives
$S_{xx} = \sum x_i^2 - (\sum x_i)^2/n = 285{,}929.5964 - (3841.78)^2/52 = 2097.4124$].

A point estimate of the population standard deviation is then $\hat{\sigma} = s = \sqrt{41.126} = 6.413$.

An alternative estimator results from using the divisor n rather than n - 1:

\[ \hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{n}, \qquad \text{estimate} = \frac{2097.4124}{52} = 40.335 \]

We will shortly indicate why many statisticians prefer $S^2$ to this latter estimator.

The cited article considered fitting four different distributions to the data:
normal, lognormal, two-parameter Weibull, and three-parameter Weibull. Several
different techniques were used to conclude that the two-parameter Weibull provided
the best fit (a normal probability plot of the data shows some deviation from a linear
pattern). From Section 4.5, the variance of a Weibull random variable is

\[ \sigma^2 = \beta^2 \left\{ \Gamma(1 + 2/\alpha) - [\Gamma(1 + 1/\alpha)]^2 \right\} \]

where $\alpha$ and $\beta$ are the shape and scale parameters of the distribution. The authors
of the article used the method of maximum likelihood (see Section 6.2) to estimate
these parameters. The resulting estimates were $\hat{\alpha} = 11.9731$, $\hat{\beta} = 77.0153$. A
sensible estimate of the population variance can now be obtained from substituting the
estimates of the two parameters into the expression for $\sigma^2$; the result is $\hat{\sigma}^2 = 56.035$.
This latter estimate is obviously quite different from the sample variance. Its
validity depends on the population distribution being Weibull, whereas the sample
variance is a sensible way to estimate $\sigma^2$ when there is uncertainty as to the specific form
of the population distribution.
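The three variance estimates in Example 6.3 can be checked numerically. The sketch below starts from the summary quantities quoted in the example rather than the raw data, and the function name `weibull_variance` is our own label for the Section 4.5 formula; it is an illustration, not the computation used by the article's authors.

```python
import math

# Summary quantities quoted in Example 6.3.
n = 52
sxx = 2097.4124  # sum of squared deviations about the sample mean

s2_unbiased = sxx / (n - 1)   # divisor n - 1: the sample variance
s2_divisor_n = sxx / n        # divisor n: the alternative estimator
s = math.sqrt(s2_unbiased)    # point estimate of the standard deviation

def weibull_variance(alpha, beta):
    """Variance of a Weibull rv with shape alpha and scale beta."""
    g1 = math.gamma(1 + 1 / alpha)
    g2 = math.gamma(1 + 2 / alpha)
    return beta**2 * (g2 - g1**2)

# Plug the reported maximum likelihood estimates into the variance formula.
weibull_plugin = weibull_variance(11.9731, 77.0153)

print(f"divisor n-1: {s2_unbiased:.3f}")
print(f"divisor n:   {s2_divisor_n:.3f}")
print(f"s:           {s:.3f}")
print(f"Weibull plug-in variance: {weibull_plugin:.3f}")
```

The first two estimates differ only through the divisor, whereas the plug-in value depends on the Weibull assumption being appropriate.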
In the best of all possible worlds, we could find an estimator $\hat{\theta}$ for which $\hat{\theta} = \theta$
always. However, $\hat{\theta}$ is a function of the sample $X_i$'s, so it is a random variable. For
some samples, $\hat{\theta}$ will yield a value larger than $\theta$, whereas for other samples $\hat{\theta}$ will
underestimate $\theta$. If we write

\[ \hat{\theta} = \theta + \text{error of estimation} \]

then an accurate estimator would be one resulting in small estimation errors, so that
estimated values will be near the true value.

A sensible way to quantify the idea of $\hat{\theta}$ being close to $\theta$ is to consider the
squared error $(\hat{\theta} - \theta)^2$. For some samples, $\hat{\theta}$ will be quite close to $\theta$ and the
resulting squared error will be near 0. Other samples may give values of $\hat{\theta}$ far from $\theta$,
corresponding to very large squared errors. An omnibus measure of accuracy is the
expected or mean square error MSE = $E[(\hat{\theta} - \theta)^2]$. If a first estimator has smaller
MSE than does a second, it is natural to say that the first estimator is the better one.
However, MSE will generally depend on the value of $\theta$. What often happens is that
one estimator will have a smaller MSE for some values of $\theta$ and a larger MSE for
other values. Finding an estimator with the smallest MSE is typically not possible.
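A standard identity, stated here without proof and not derived in this section, helps explain why the discussion turns next to bias and then to variance. Adding and subtracting the mean of the estimator inside the square gives

```latex
\[
\text{MSE}
  = E\big[(\hat{\theta} - \theta)^2\big]
  = V(\hat{\theta}) + \big[E(\hat{\theta}) - \theta\big]^2
\]
```

The first term measures the spread of the estimator's sampling distribution, and the second is the square of the systematic error $E(\hat{\theta}) - \theta$. When that second term is zero for every value of the parameter, comparing estimators by MSE reduces to comparing their variances.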

One way out of this dilemma is to restrict attention just to estimators that have 
some specified desirable property and then find the best estimator in this restricted 
group. A popular property of this sort in the statistical community is unbiasedness. 


Unbiased Estimators 


Suppose we have two measuring instruments; one instrument has been accurately 
calibrated, but the other systematically gives readings larger than the true value being 
measured. When each instrument is used repeatedly on the same object, because of 
measurement error, the observed measurements will not be identical. However, the 
measurements produced by the first instrument will be distributed about the true value 
in such a way that on average this instrument measures what it purports to measure, 
so it is called an unbiased instrument. The second instrument yields observations that 
have a systematic error component or bias. Figure 6.1 shows 10 measurements from 
both an unbiased and a biased instrument. 


Figure 6.1 Measurements from (a) an unbiased instrument, and (b) a biased instrument
(each panel marks the true value of the characteristic)


DEFINITION A point estimator $\hat{\theta}$ is said to be an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for
every possible value of $\theta$. If $\hat{\theta}$ is not unbiased, the difference $E(\hat{\theta}) - \theta$ is called
the bias of $\hat{\theta}$.


That is, $\hat{\theta}$ is unbiased if its probability (i.e., sampling) distribution is always
“centered” at the true value of the parameter. Suppose $\hat{\theta}$ is an unbiased estimator; then if
$\theta = 100$, the $\hat{\theta}$ sampling distribution is centered at 100; if $\theta = 27.5$, then the $\hat{\theta}$
sampling distribution is centered at 27.5, and so on. Figure 6.2 pictures the distributions
of several biased and unbiased estimators. Note that “centered” here means that the
expected value, not the median, of the distribution of $\hat{\theta}$ is equal to $\theta$.

Figure 6.2 The pdf's of a biased estimator $\hat{\theta}_1$ and an unbiased estimator $\hat{\theta}_2$ for a parameter $\theta$


It may seem as though it is necessary to know the value of $\theta$ (in which
case estimation is unnecessary) to see whether $\hat{\theta}$ is unbiased. This is not usually
the case, though, because unbiasedness is a general property of the estimator's
sampling distribution (where it is centered), which is typically not dependent on
any particular parameter value.

In Example 6.1, the sample proportion X/n was used as an estimator of p,
where X, the number of sample successes, had a binomial distribution with
parameters n and p. Thus

\[ E(\hat{p}) = E\left(\frac{X}{n}\right) = \frac{1}{n} E(X) = \frac{1}{n}(np) = p \]

PROPOSITION When X is a binomial rv with parameters n and p, the sample proportion
$\hat{p} = X/n$ is an unbiased estimator of p.


No matter what the true value of p is, the distribution of the estimator $\hat{p}$ will be
centered at the true value.


EXAMPLE 6.4 Suppose that X, the reaction time to a certain stimulus, has a uniform distribution
on the interval from 0 to an unknown upper limit $\theta$ (so the density function of X is
rectangular in shape with height $1/\theta$ for $0 \le x \le \theta$). It is desired to estimate $\theta$ on the
basis of a random sample $X_1, X_2, \ldots, X_n$ of reaction times. Since $\theta$ is the largest
possible time in the entire population of reaction times, consider as a first estimator the
largest sample reaction time: $\hat{\theta}_1 = \max(X_1, \ldots, X_n)$. If n = 5 and $x_1 = 4.2$, $x_2 = 1.7$,
$x_3 = 2.4$, $x_4 = 3.9$, and $x_5 = 1.3$, the point estimate of $\theta$ is $\hat{\theta}_1 = \max(4.2, 1.7, 2.4,
3.9, 1.3) = 4.2$.

Unbiasedness implies that some samples will yield estimates that exceed $\theta$ and
other samples will yield estimates smaller than $\theta$; otherwise $\theta$ could not possibly
be the center (balance point) of $\hat{\theta}_1$'s distribution. However, our proposed estimator
will never overestimate $\theta$ (the largest sample value cannot exceed the largest
population value) and will underestimate $\theta$ unless the largest sample value equals $\theta$.
This intuitive argument shows that $\hat{\theta}_1$ is a biased estimator. More precisely, it can be
shown (see Exercise 32) that

\[ E(\hat{\theta}_1) = \frac{n}{n+1} \cdot \theta < \theta \qquad \left(\text{since } \frac{n}{n+1} < 1\right) \]

The bias of $\hat{\theta}_1$ is given by $n\theta/(n + 1) - \theta = -\theta/(n + 1)$, which approaches 0 as n
gets large.

It is easy to modify $\hat{\theta}_1$ to obtain an unbiased estimator of $\theta$. Consider the
estimator

\[ \hat{\theta}_2 = \frac{n+1}{n} \cdot \max(X_1, \ldots, X_n) \]

Using this estimator on the data gives the estimate (6/5)(4.2) = 5.04. The fact that
$(n + 1)/n > 1$ implies that $\hat{\theta}_2$ will overestimate $\theta$ for some samples and
underestimate it for others. The mean value of this estimator is

\[ E(\hat{\theta}_2) = E\left[\frac{n+1}{n} \max(X_1, \ldots, X_n)\right]
   = \frac{n+1}{n} \cdot E[\max(X_1, \ldots, X_n)]
   = \frac{n+1}{n} \cdot \frac{n}{n+1}\,\theta = \theta \]

If $\hat{\theta}_2$ is used repeatedly on different samples to estimate $\theta$, some estimates will be too
large and others will be too small, but in the long run there will be no systematic
tendency to underestimate or overestimate $\theta$.
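A small simulation can make the bias in Example 6.4 visible. The sketch below is illustrative only: the function name, the value $\theta = 5$, the number of replications, and the seed are all our own choices rather than anything taken from the text.

```python
import random
from statistics import mean

def simulate_max_estimators(theta, n, replications, seed=0):
    """Average the two estimators of Example 6.4 over many simulated samples.

    Returns (average of max(X_i), average of (n + 1)/n * max(X_i)).
    Hypothetical helper written for this illustration.
    """
    rng = random.Random(seed)
    maxima = [
        max(rng.uniform(0, theta) for _ in range(n))
        for _ in range(replications)
    ]
    avg_hat1 = mean(maxima)              # long-run average of the sample maximum
    avg_hat2 = (n + 1) / n * avg_hat1    # long-run average of the rescaled maximum
    return avg_hat1, avg_hat2

theta, n = 5.0, 5  # illustrative values, not from the text
avg1, avg2 = simulate_max_estimators(theta, n, replications=20_000)

print(f"average of max(X):           {avg1:.3f}  (theory: {n * theta / (n + 1):.3f})")
print(f"average of (n+1)/n * max(X): {avg2:.3f}  (theory: {theta:.3f})")
```

With these settings the first average should settle near $n\theta/(n+1)$, below $\theta$, while the second should settle near $\theta$ itself, in line with the two expected values derived in the example.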


Principle of Unbiased Estimation 


When choosing among several different estimators of $\theta$, select one that is
unbiased.


According to this principle, the unbiased estimator $\hat{\theta}_2$ in Example 6.4 should
be preferred to the biased estimator $\hat{\theta}_1$. Consider now the problem of estimating $\sigma^2$.


PROPOSITION Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and
variance $\sigma^2$. Then the estimator

\[ \hat{\sigma}^2 = S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \]

is unbiased for estimating $\sigma^2$.


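Proof For any rv Y, $V(Y) = E(Y^2) - [E(Y)]^2$, so $E(Y^2) = V(Y) + [E(Y)]^2$.
Applying this to

\[ S^2 = \frac{1}{n-1}\left[\sum X_i^2 - \frac{(\sum X_i)^2}{n}\right] \]

gives

\[ E(S^2) = \frac{1}{n-1}\left\{\sum E(X_i^2) - \frac{1}{n} E\left[\left(\sum X_i\right)^2\right]\right\} \]
\[ = \frac{1}{n-1}\left\{\sum (\sigma^2 + \mu^2) - \frac{1}{n}\left\{V\left(\sum X_i\right) + \left[E\left(\sum X_i\right)\right]^2\right\}\right\} \]
\[ = \frac{1}{n-1}\left\{n\sigma^2 + n\mu^2 - \frac{1}{n}\,n\sigma^2 - \frac{1}{n}(n\mu)^2\right\} \]
\[ = \frac{1}{n-1}\left\{n\sigma^2 - \sigma^2\right\} = \sigma^2 \quad \text{(as desired)} \]

The estimator that uses divisor n can be expressed as $(n - 1)S^2/n$, so

\[ E\left[\frac{(n-1)S^2}{n}\right] = \frac{n-1}{n} E(S^2) = \frac{n-1}{n}\,\sigma^2 \]

This estimator is therefore not unbiased. The bias is $(n - 1)\sigma^2/n - \sigma^2 = -\sigma^2/n$.
Because the bias is negative, the estimator with divisor n tends to underestimate $\sigma^2$,
and this is why the divisor n - 1 is preferred by many statisticians (though when n
is large, the bias is small and there is little difference between the two).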

Unfortunately, the fact that $S^2$ is unbiased for estimating $\sigma^2$ does not imply that
$S$ is unbiased for estimating $\sigma$. Taking the square root invalidates the property of
unbiasedness (the expected value of the square root is not the square root of the
expected value). Fortunately, the bias of $S$ is small unless n is quite small. There are
other good reasons to use $S$ as an estimator, especially when the population
distribution is normal. These will become more apparent when we discuss confidence
intervals and hypothesis testing in the next several chapters.

In Example 6.2, we proposed several different estimators for the mean $\mu$ of
a normal distribution. If there were a unique unbiased estimator for $\mu$, the
estimation problem would be resolved by using that estimator. Unfortunately, this is
not the case.


PROPOSITION If $X_1, X_2, \ldots, X_n$ is a random sample from a distribution with mean $\mu$, then $\bar{X}$
is an unbiased estimator of $\mu$. If in addition the distribution is continuous and
symmetric, then $\tilde{X}$ and any trimmed mean are also unbiased estimators of $\mu$.


The fact that $\bar{X}$ is unbiased is just a restatement of one of our rules of expected value:
$E(\bar{X}) = \mu$ for every possible value of $\mu$ (for discrete as well as continuous
distributions). The unbiasedness of the other estimators is more difficult to verify.

The next example introduces another situation in which there are several un- 
biased estimators for a particular parameter. 


EXAMPLE 6.5 Under certain circumstances organic contaminants adhere readily to wafer surfaces 
and cause deterioration in semiconductor manufacturing devices. The article 
“Ceramic Chemical Filter for Removal of Organic Contaminants” (J. of the 
Institute of Envir. Sciences and Tech., 2003: 59-65) discussed a recently developed 
alternative to conventional charcoal filters for removing organic airborne molecular 
contamination in cleanroom applications. One aspect of the investigation of filter 
performance involved studying how contaminant concentration in air related to 
concentration on a wafer surface after prolonged exposure. Consider the following 
representative data on x = DBP concentration in air and y = DBP concentration
on a wafer surface after 4-hour exposure (both in μg/m³, where DBP = dibutyl
phthalate).


Obs. i:   1     2     3     4     5      6
x:        .8    1.3   1.5   3.0   11.6   26.6
y:        .6    1.1   4.5   3.5   14.4   29.1

The authors comment that “DBP adhesion on the wafer surface was roughly
proportional to the DBP concentration in air.” Figure 6.3 shows a plot of y versus x, i.e.,
of the (x, y) pairs.

Figure 6.3 Plot of the DBP data from Example 6.5


If y were exactly proportional to x, then $y = \beta x$ for some value $\beta$, which says that
the (x, y) points in the plot would lie exactly on a straight line with slope $\beta$ passing
through (0, 0). But this is only approximately the case. So we now assume that for
any fixed x, wafer DBP is a random variable Y having mean value $\beta x$. That is, we
postulate that the mean value of Y is related to x by a line passing through (0, 0)
but that the observed value of Y will typically deviate from this line (this is referred
to in the statistical literature as “regression through the origin”).
Consider the following three estimators for the slope parameter $\beta$:

\[ \#1:\ \hat{\beta} = \frac{1}{n}\sum \frac{Y_i}{x_i} \qquad
   \#2:\ \hat{\beta} = \frac{\sum Y_i}{\sum x_i} \qquad
   \#3:\ \hat{\beta} = \frac{\sum x_i Y_i}{\sum x_i^2} \]

The resulting estimates based on the given data are 1.3497, 1.1875, and 1.1222,
respectively. So the estimate definitely depends on which estimator is used. If one of
these three estimators were unbiased and the other two were biased, there would be
a good case for using the unbiased one. But all three are unbiased; the argument
relies on the fact that each one is a linear function of the $Y_i$'s (we are assuming here
that the $x_i$'s are fixed, not random):

\[ E\left(\frac{1}{n}\sum \frac{Y_i}{x_i}\right) = \frac{1}{n}\sum \frac{E(Y_i)}{x_i}
   = \frac{1}{n}\sum \frac{\beta x_i}{x_i} = \frac{1}{n}\sum \beta = \frac{n\beta}{n} = \beta \]

\[ E\left(\frac{\sum Y_i}{\sum x_i}\right) = \frac{1}{\sum x_i} E\left(\sum Y_i\right)
   = \frac{1}{\sum x_i}\left(\sum \beta x_i\right) = \frac{1}{\sum x_i}\,\beta\left(\sum x_i\right) = \beta \]

\[ E\left(\frac{\sum x_i Y_i}{\sum x_i^2}\right) = \frac{1}{\sum x_i^2} E\left(\sum x_i Y_i\right)
   = \frac{1}{\sum x_i^2}\left(\sum x_i \beta x_i\right) = \frac{1}{\sum x_i^2}\,\beta\left(\sum x_i^2\right) = \beta \]

In both the foregoing example and the situation involving estimating a normal
population mean, the principle of unbiasedness (preferring an unbiased estimator to a
biased one) cannot be invoked to select an estimator. What we now need is a criterion
for choosing among unbiased estimators.

Estimators with Minimum Variance 


Suppose $\hat{\theta}_1$ and $\hat{\theta}_2$ are two estimators of $\theta$ that are both unbiased. Then, although
the distribution of each estimator is centered at the true value of $\theta$, the spreads of
the distributions about the true value may be different.

Principle of Minimum Variance Unbiased Estimation


Among all estimators of $\theta$ that are unbiased, choose the one that has minimum
variance. The resulting $\hat{\theta}$ is called the minimum variance unbiased estimator
(MVUE) of $\theta$.


Figure 6.4(a) shows distributions of two different unbiased estimators. Use of
the estimator with the more concentrated distribution is more likely than the other
one to result in an estimate closer to $\theta$. Figure 6.4(b) displays estimates from the
two estimators based on 10 different samples. The MVUE is, in a certain sense, the
most likely among all unbiased estimators to produce an estimate close to the true $\theta$.


Figure 6.4 (a) Distributions of two unbiased estimators (b) Estimates based
on 10 different samples


In Example 6.5, suppose each $Y_i$ is normally distributed with mean $\beta x_i$ and
variance $\sigma^2$ (the assumption of constant variance). Then it can be shown that the third
estimator $\hat{\beta} = \sum x_i Y_i / \sum x_i^2$ not only has smaller variance than either of the other two
unbiased estimators, but in fact is the MVUE: it has smaller variance than any other
unbiased estimator of $\beta$.


EXAMPLE 6.6 We argued in Example 6.4 that when $X_1, \ldots, X_n$ is a random sample from a uniform
distribution on $[0, \theta]$, the estimator

\[ \hat{\theta}_1 = \frac{n+1}{n} \cdot \max(X_1, \ldots, X_n) \]

is unbiased for $\theta$ (we previously denoted this estimator by $\hat{\theta}_2$). This is not the only
unbiased estimator of $\theta$. The expected value of a uniformly distributed rv is just the
midpoint of the interval of positive density, so $E(X_i) = \theta/2$. This implies that $E(\bar{X}) =
\theta/2$, from which $E(2\bar{X}) = \theta$. That is, the estimator $\hat{\theta}_2 = 2\bar{X}$ is unbiased for $\theta$.
If X is uniformly distributed on the interval from A to B, then $V(X) = \sigma^2
= (B - A)^2/12$. Thus, in our situation, $V(X_i) = \theta^2/12$, $V(\bar{X}) = \sigma^2/n = \theta^2/(12n)$,
and $V(\hat{\theta}_2) = V(2\bar{X}) = 4V(\bar{X}) = \theta^2/(3n)$. The results of Exercise 32 can be used
to show that $V(\hat{\theta}_1) = \theta^2/[n(n + 2)]$. The estimator $\hat{\theta}_1$ has smaller variance than
does $\hat{\theta}_2$ if $3n < n(n + 2)$, that is, if $0 < n^2 - n = n(n - 1)$. As long as n > 1,
$V(\hat{\theta}_1) < V(\hat{\theta}_2)$, so $\hat{\theta}_1$ is a better estimator than $\hat{\theta}_2$. More advanced methods can be
used to show that $\hat{\theta}_1$ is the MVUE of $\theta$: every other unbiased estimator of $\theta$ has
variance that exceeds $\theta^2/[n(n + 2)]$.
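To see how quickly the two variances in Example 6.6 separate as n grows, the two formulas can be tabulated exactly. The sketch below works in units of $\theta^2$ (that is, it sets $\theta = 1$), and the function names are our own; it illustrates the stated formulas and is not a derivation of them.

```python
from fractions import Fraction

def var_scaled_max(n):
    """V of (n + 1)/n * max(X_i), in units of theta^2: 1 / [n(n + 2)]."""
    return Fraction(1, n * (n + 2))

def var_twice_mean(n):
    """V of 2 * X-bar, in units of theta^2: 1 / (3n)."""
    return Fraction(1, 3 * n)

for n in (1, 2, 5, 20):
    v_max = var_scaled_max(n)
    v_mean = var_twice_mean(n)
    # A ratio below 1 means the rescaled maximum has the smaller variance.
    print(f"n={n:2d}  scaled max: {v_max}  2*mean: {v_mean}  ratio: {v_max / v_mean}")
```

The ratio of the two variances equals 3/(n + 2), so the two estimators tie at n = 1 and the advantage of the rescaled maximum keeps growing with the sample size.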
One of the triumphs of mathematical statistics has been the development of
methodology for identifying the MVUE in a wide variety of situations. The most
important result of this type for our purposes concerns estimating the mean $\mu$ of a
normal distribution.


THEOREM Let $X_1, \ldots, X_n$ be a random sample from a normal distribution with parameters
$\mu$ and $\sigma$. Then the estimator $\hat{\mu} = \bar{X}$ is the MVUE for $\mu$.


Whenever we are convinced that the population being sampled is normal, the
theorem says that $\bar{x}$ should be used to estimate $\mu$. In Example 6.2, then, our estimate
would be $\bar{x} = 27.793$.

In some situations, it is possible to obtain an estimator with small bias that 
would be preferred to the best unbiased estimator. This is illustrated in Figure 6.5. 
However, MVUEs are often easier to obtain than the type of biased estimator whose 
distribution is pictured. 


Figure 6.5 A biased estimator that is preferable to the MVUE (the figure shows the pdf of
$\hat{\theta}_1$, a biased estimator, and the pdf of $\hat{\theta}_2$, the MVUE, relative to $\theta$)


Some Complications 


The last theorem does not say that in estimating a population mean $\mu$, the estimator
$\bar{X}$ should be used irrespective of the distribution being sampled.


EXAMPLE 6.7 Suppose we wish to estimate the thermal conductivity $\mu$ of a certain material. Using
standard measurement techniques, we will obtain a random sample $X_1, \ldots, X_n$ of n
thermal conductivity measurements. Let's assume that the population distribution is
a member of one of the following three families:

\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)} \qquad -\infty < x < \infty \tag{6.1} \]

\[ f(x) = \frac{1}{\pi[1 + (x - \mu)^2]} \qquad -\infty < x < \infty \tag{6.2} \]

\[ f(x) = \begin{cases} \dfrac{1}{2c} & \mu - c \le x \le \mu + c \\ 0 & \text{otherwise} \end{cases} \tag{6.3} \]

The pdf (6.1) is the normal distribution, (6.2) is called the Cauchy distribution, and
(6.3) is a uniform distribution. All three distributions are symmetric about $\mu$. The
Cauchy density curve is bell-shaped but with much heavier tails (more probability
farther out) than the normal curve. In fact, the tails are so heavy that the mean value does
not exist, though $\mu$ is still the median and a location parameter for the distribution.
The uniform distribution has no tails. The four estimators for $\mu$ considered earlier are
$\bar{X}$, $\tilde{X}$, $\bar{X}_e$ (the average of the two extreme observations), and $\bar{X}_{tr(10)}$, a trimmed mean.
The very important moral here is that the best estimator for $\mu$ depends
crucially on which distribution is being sampled. In particular,


1. If the random sample comes from a normal distribution, then $\bar{X}$ is the best of the
four estimators, since it has minimum variance among all unbiased estimators.


2. If the random sample comes from a Cauchy distribution, then $\bar{X}$ and $\bar{X}_e$ are
terrible estimators for $\mu$, whereas $\tilde{X}$ is quite good (the MVUE is not known); $\bar{X}$
is bad because it is very sensitive to outlying observations, and the heavy tails
of the Cauchy distribution make a few such observations likely to appear in any
sample.


3. If the underlying distribution is uniform, the best estimator is $\bar{X}_e$; this estimator
is greatly influenced by outlying observations, but the lack of tails makes such
observations impossible.


4. The trimmed mean is best in none of these three situations but works
reasonably well in all three. That is, $\bar{X}_{tr(10)}$ does not suffer too much in comparison with
the best procedure in any of the three situations.


More generally, recent research in statistics has established that when estimating
a point of symmetry $\mu$ of a continuous probability distribution, a trimmed mean with
trimming proportion 10% or 20% (from each end of the sample) produces reasonably
behaved estimates over a very wide range of possible models. For this reason, a trimmed
mean with small trimming percentage is said to be a robust estimator.

In some situations, the choice is not between two different estimators con- 
structed from the same sample, but instead between estimators based on two differ- 
ent experiments. 


EXAMPLE 6.8 Suppose a certain type of component has a lifetime distribution that is exponential with
parameter $\lambda$ so that expected lifetime is $\mu = 1/\lambda$. A sample of n such components is
selected, and each is put into operation. If the experiment is continued until all n
lifetimes, $X_1, \ldots, X_n$, have been observed, then $\bar{X}$ is an unbiased estimator of $\mu$.

In some experiments, though, the components are left in operation only until
the time of the rth failure, where r < n. This procedure is referred to as censoring.
Let $Y_1$ denote the time of the first failure (the minimum lifetime among the n
components), $Y_2$ denote the time at which the second failure occurs (the second smallest
lifetime), and so on. Since the experiment terminates at time $Y_r$, the total
accumulated lifetime at termination is

\[ T_r = \sum_{i=1}^{r} Y_i + (n - r)Y_r \]

We now demonstrate that $\hat{\mu} = T_r/r$ is an unbiased estimator for $\mu$. To do so, we
need two properties of exponential variables:
1. The memoryless property (see Section 4.4), which says that at any time point,
remaining lifetime has the same exponential distribution as original lifetime.
2. When $X_1, \ldots, X_k$ are independent, each exponentially distributed with parameter
$\lambda$, $\min(X_1, \ldots, X_k)$ is exponential with parameter $k\lambda$.
Since all n components last until $Y_1$, n - 1 last an additional $Y_2 - Y_1$, n - 2 an
additional $Y_3 - Y_2$ amount of time, and so on, another expression for $T_r$ is

\[ T_r = nY_1 + (n - 1)(Y_2 - Y_1) + (n - 2)(Y_3 - Y_2) + \cdots + (n - r + 1)(Y_r - Y_{r-1}) \]

But $Y_1$ is the minimum of n exponential variables, so $E(Y_1) = 1/(n\lambda)$. Similarly,
$Y_2 - Y_1$ is the smallest of the n - 1 remaining lifetimes, each exponential with
parameter $\lambda$ (by the memoryless property), so $E(Y_2 - Y_1) = 1/[(n - 1)\lambda]$. Continuing,
$E(Y_{i+1} - Y_i) = 1/[(n - i)\lambda]$, so

\[ E(T_r) = nE(Y_1) + (n - 1)E(Y_2 - Y_1) + \cdots + (n - r + 1)E(Y_r - Y_{r-1}) \]
\[ = n \cdot \frac{1}{n\lambda} + (n - 1) \cdot \frac{1}{(n - 1)\lambda} + \cdots + (n - r + 1) \cdot \frac{1}{(n - r + 1)\lambda} = \frac{r}{\lambda} \]

Therefore, $E(T_r/r) = (1/r)E(T_r) = (1/r) \cdot (r/\lambda) = 1/\lambda = \mu$ as claimed.
As an example, suppose 20 components are tested and r = 10. Then if the first
ten failure times are 11, 15, 29, 33, 35, 40, 47, 55, 58, and 72, the estimate of $\mu$ is

\[ \hat{\mu} = \frac{11 + 15 + \cdots + 72 + (10)(72)}{10} = 111.5 \]

The advantage of the experiment with censoring is that it terminates more quickly
than the uncensored experiment. However, it can be shown that $V(T_r/r) = 1/(\lambda^2 r)$,
which is larger than $1/(\lambda^2 n)$, the variance of $\bar{X}$ in the uncensored experiment.
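The censored-data estimate in Example 6.8 is a short computation once the total accumulated lifetime is written out. The sketch below is a direct transcription of $T_r/r$; the function name `censored_mean_estimate` is our own.

```python
def censored_mean_estimate(failure_times, n):
    """Estimate mean lifetime from the first r failures among n components.

    Computes T_r / r, where T_r is the sum of the r observed failure times
    plus (n - r) times the r-th (largest observed) failure time, since the
    n - r surviving components were each in operation until the test stopped.
    """
    ordered = sorted(failure_times)
    r = len(ordered)
    total_time = sum(ordered) + (n - r) * ordered[-1]
    return total_time / r

# Data from the example: 20 components on test, stopped at the 10th failure.
failures = [11, 15, 29, 33, 35, 40, 47, 55, 58, 72]
mu_hat = censored_mean_estimate(failures, n=20)

print(mu_hat)
```

Note that simply averaging the ten observed failure times would give 39.5, which ignores the ten components still running at time 72 and so would badly understate the mean lifetime.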


Reporting a Point Estimate: 
The Standard Error 


Besides reporting the value of a point estimate, some indication of its precision should 
be given. The usual measure of precision is the standard error of the estimator used. 


DEFINITION The standard error of an estimator $\hat{\theta}$ is its standard deviation $\sigma_{\hat{\theta}} = \sqrt{V(\hat{\theta})}$.
It is the magnitude of a typical or representative deviation between an estimate
and the value of $\theta$. If the standard error itself involves unknown parameters
whose values can be estimated, substitution of these estimates into $\sigma_{\hat{\theta}}$ yields
the estimated standard error (estimated standard deviation) of the estimator.
The estimated standard error can be denoted either by $\hat{\sigma}_{\hat{\theta}}$ (the ^ over $\sigma$
emphasizes that $\sigma_{\hat{\theta}}$ is being estimated) or by $s_{\hat{\theta}}$.


EXAMPLE 6.9 (Example 6.2 continued) Assuming that breakdown voltage is normally distributed,
$\hat{\mu} = \bar{X}$ is the best estimator of $\mu$. If the value of $\sigma$ is known to be 1.5, the standard
error of $\bar{X}$ is $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 1.5/\sqrt{20} = .335$. If, as is usually the case, the value of $\sigma$ is
unknown, the estimate $\hat{\sigma} = s = 1.462$ is substituted into $\sigma_{\bar{X}}$ to obtain the estimated
standard error $\hat{\sigma}_{\bar{X}} = s_{\bar{X}} = s/\sqrt{n} = 1.462/\sqrt{20} = .327$.


EXAMPLE 6.10 (Example 6.1 continued) The standard error of $\hat{p} = X/n$ is

\[ \sigma_{\hat{p}} = \sqrt{V(X/n)} = \sqrt{\frac{V(X)}{n^2}} = \sqrt{\frac{npq}{n^2}} = \sqrt{\frac{pq}{n}} \]

Since p and q = 1 - p are unknown (else why estimate?), we substitute $\hat{p} = x/n$
and $\hat{q} = 1 - x/n$ into $\sigma_{\hat{p}}$, yielding the estimated standard error $\hat{\sigma}_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} =
\sqrt{(.6)(.4)/25} = .098$. Alternatively, since the largest value of pq is attained when
p = q = .5, an upper bound on the standard error is $\sqrt{1/(4n)} = .10$.
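The standard errors in Examples 6.9 and 6.10 can be verified with two one-line functions. This is a minimal sketch; the names `se_mean` and `se_proportion` are our own, and the inputs are the values quoted in the two examples.

```python
import math

def se_mean(sd, n):
    """Standard error of the sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Example 6.9: n = 20 breakdown voltages.
se_known_sigma = se_mean(1.5, 20)     # sigma assumed known
se_estimated = se_mean(1.462, 20)     # s substituted for sigma

# Example 6.10: x = 15 undamaged crashes out of n = 25.
se_p_hat = se_proportion(15 / 25, 25)
se_upper_bound = se_proportion(0.5, 25)   # worst case, p = q = .5

print(f"{se_known_sigma:.3f} {se_estimated:.3f} {se_p_hat:.3f} {se_upper_bound:.3f}")
```

Passing p = .5 to `se_proportion` reproduces the bound $\sqrt{1/(4n)}$, which can be useful at the planning stage, before any data are available.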


When the point estimator $\hat{\theta}$ has approximately a normal distribution, which
will often be the case when n is large, then we can be reasonably confident that the
true value of $\theta$ lies within approximately 2 standard errors (standard deviations) of $\hat{\theta}$.
Thus if a sample of n = 36 component lifetimes gives $\hat{\mu} = \bar{x} = 28.50$ and s = 3.60,
then $s/\sqrt{n} = .60$, so within 2 estimated standard errors, $\hat{\mu}$ translates to the interval
$28.50 \pm (2)(.60) = (27.30, 29.70)$.

If $\hat{\theta}$ is not necessarily approximately normal but is unbiased, then it can be
shown that the estimate will deviate from $\theta$ by as much as 4 standard errors at most
6% of the time. We would then expect the true value to lie within 4 standard errors
of $\hat{\theta}$ (and this is a very conservative statement, since it applies to any unbiased $\hat{\theta}$).
Summarizing, the standard error tells us roughly within what distance of $\hat{\theta}$ we can
expect the true value of $\theta$ to lie.
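The “at most 6%” figure is consistent with Chebyshev's inequality, which we take to be the result behind the “it can be shown” above (the text does not name it). For an unbiased estimator, $E(\hat{\theta}) = \theta$, so for any $k > 0$

```latex
\[
P\big(|\hat{\theta} - \theta| \ge k\,\sigma_{\hat{\theta}}\big) \le \frac{1}{k^2},
\qquad\text{and with } k = 4:\quad
P\big(|\hat{\theta} - \theta| \ge 4\,\sigma_{\hat{\theta}}\big) \le \frac{1}{16} = .0625
\]
```

Because the bound makes no assumption about the shape of the sampling distribution, it is much weaker than the 2-standard-error statement available under approximate normality.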

The form of the estimator $\hat{\theta}$ may be sufficiently complicated so that standard
statistical theory cannot be applied to obtain an expression for $\sigma_{\hat{\theta}}$. This is true, for
example, in the case $\theta = \sigma$, $\hat{\theta} = S$; the standard deviation of the statistic $S$, $\sigma_S$, cannot
in general be determined. In recent years, a new computer-intensive method called the
bootstrap has been introduced to address this problem. Suppose that the population
pdf is $f(x; \theta)$, a member of a particular parametric family, and that data $x_1, x_2, \ldots, x_n$
gives $\hat{\theta} = 21.7$. We now use statistical software to obtain “bootstrap samples” from
the pdf $f(x; 21.7)$, and for each sample calculate a “bootstrap estimate” $\hat{\theta}^*$:


First bootstrap sample:   $x_1^*, x_2^*, \ldots, x_n^*$;  estimate = $\hat{\theta}_1^*$
Second bootstrap sample:  $x_1^*, x_2^*, \ldots, x_n^*$;  estimate = $\hat{\theta}_2^*$
...
Bth bootstrap sample:     $x_1^*, x_2^*, \ldots, x_n^*$;  estimate = $\hat{\theta}_B^*$


B = 100 or 200 is often used. Now let $\bar{\theta}^* = \sum \hat{\theta}_i^*/B$, the sample mean of the bootstrap
estimates. The bootstrap estimate of $\hat{\theta}$'s standard error is now just the sample
standard deviation of the $\hat{\theta}_i^*$'s:

\[ S_{\hat{\theta}} = \sqrt{\frac{1}{B - 1}\sum \left(\hat{\theta}_i^* - \bar{\theta}^*\right)^2} \]

(In the bootstrap literature, B is often used in place of B - 1; for typical values of B,
there is usually little difference between the resulting estimates.)


EXAMPLE 6.11 A theoretical model suggests that X, the time to breakdown of an insulating fluid
between electrodes at a particular voltage, has $f(x; \lambda) = \lambda e^{-\lambda x}$, an exponential
distribution. A random sample of n = 10 breakdown times (min) gives the following data:


41.53 18.73 2.99 30.34 12.33 117.52 73.02 223.63 4.00 26.78


Since $E(X) = 1/\lambda$, $E(\bar{X}) = 1/\lambda$, so a reasonable estimate of $\lambda$ is $\hat{\lambda} = 1/\bar{x} = 1/55.087 =
.018153$. We then used a statistical computer package to obtain B = 100 bootstrap
samples, each of size 10, from $f(x; .018153)$. The first such sample was
41.00, 109.70, 16.78, 6.31, 6.76, 5.62, 60.96, 78.81, 192.25, 27.61, from which
$\sum x_i^* = 545.8$ and $\hat{\lambda}_1^* = 1/54.58 = .01832$. The average of the 100 bootstrap
estimates is $\bar{\lambda}^* = .02153$, and the sample standard deviation of these 100 estimates is
$s_{\hat{\lambda}} = .0091$, the bootstrap estimate of $\hat{\lambda}$’s standard error. A histogram of the 100 $\hat{\lambda}_i^*$’s
was somewhat positively skewed, suggesting that the sampling distribution of $\hat{\lambda}$
also has this property. ■
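
The parametric bootstrap of Example 6.11 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `parametric_bootstrap_se` and the seed are hypothetical choices, and because the bootstrap samples are random, the resulting standard error will be of the same order as, but not identical to, the .0091 reported above.

```python
import random
import statistics

# Breakdown times (min) from Example 6.11
times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]

def parametric_bootstrap_se(data, B=100, seed=1):
    """Bootstrap the standard error of lambda-hat = 1/x-bar under an exponential model."""
    rng = random.Random(seed)  # arbitrary seed, used only for reproducibility
    n = len(data)
    lam_hat = 1.0 / statistics.mean(data)
    boot = []
    for _ in range(B):
        # One bootstrap sample of size n drawn from f(x; lam_hat)
        sample = [rng.expovariate(lam_hat) for _ in range(n)]
        boot.append(1.0 / statistics.mean(sample))
    # statistics.stdev divides by B - 1, as in the formula for the bootstrap SE
    return lam_hat, statistics.mean(boot), statistics.stdev(boot)

lam_hat, boot_mean, boot_se = parametric_bootstrap_se(times)
print(f"lambda-hat = {lam_hat:.6f}, mean of bootstrap estimates = {boot_mean:.5f}, "
      f"bootstrap SE = {boot_se:.4f}")
```

Increasing `B` makes the bootstrap standard error less dependent on the particular random draws.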


Sometimes an investigator wishes to estimate a population characteristic without 
assuming that the population distribution belongs to a particular parametric family. An 
instance of this occurred in Example 6.7, where a 10% trimmed mean was proposed 
for estimating a symmetric population distribution’s center $\theta$. The data of Example 6.2
gave $\hat{\theta} = \bar{x}_{tr(10)} = 27.838$, but now there is no assumed $f(x; \theta)$, so how can we obtain a
bootstrap sample? The answer is to regard the sample itself as constituting the popula- 
tion (the n = 20 observations in Example 6.2) and take B different samples, each of size 
n, with replacement from this population. Several of the books listed in the chapter 
bibliography provide more information about bootstrapping. 
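
Resampling from the data itself might look like the following sketch. The data of Example 6.2 are not reproduced in this section, so the 20 HDL observations from Exercise 2 below serve purely as a stand-in sample; the helper names `trimmed_mean` and `nonparametric_bootstrap_se` are hypothetical, and B = 200 and the seed are arbitrary.

```python
import random
import statistics

def trimmed_mean(xs, frac=0.10):
    """Mean after dropping the smallest and largest `frac` of the observations."""
    k = int(frac * len(xs) + 1e-9)  # number trimmed from each end
    s = sorted(xs)
    return statistics.mean(s[k:len(s) - k])

def nonparametric_bootstrap_se(data, estimator, B=200, seed=1):
    """Standard deviation of the estimator over B resamples drawn with replacement."""
    rng = random.Random(seed)  # arbitrary seed for reproducibility
    n = len(data)
    boot = [estimator(rng.choices(data, k=n)) for _ in range(B)]
    return statistics.stdev(boot)

# Stand-in sample: the HDL observations listed in Exercise 2 of this section
hdl = [35, 49, 52, 54, 65, 51, 51, 47, 86, 36,
       46, 33, 39, 45, 39, 63, 95, 35, 30, 48]

est = trimmed_mean(hdl)
se = nonparametric_bootstrap_se(hdl, trimmed_mean)
print(f"10% trimmed mean = {est:.4f}, bootstrap SE = {se:.3f}")
```

The only change from the parametric version is where the bootstrap samples come from: the observed sample plays the role of the population.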


EXERCISES Section 6.1 (1-19)


1. The accompanying data on flexural strength (MPa) for
concrete beams of a certain type was introduced in
Example 1.2.


5.9 7.2 7.3 6.3 8.1 6.8 7.0
7.6 6.8 6.5 7.0 6.3 7.9 9.0
8.2 8.7 7.8 9.7 7.4 7.7 9.7
7.8 7.7 11.6 11.3 11.8 10.7


a. Calculate a point estimate of the mean value of
strength for the conceptual population of all beams
manufactured in this fashion, and state which
estimator you used. [Hint: $\sum x_i = 219.8$.]

b. Calculate a point estimate of the strength value
that separates the weakest 50% of all such beams
from the strongest 50%, and state which estimator
you used.

c. Calculate and interpret a point estimate of the
population standard deviation $\sigma$. Which estimator did you
use? [Hint: $\sum x_i^2 = 1860.94$.]

d. Calculate a point estimate of the proportion of all such
beams whose flexural strength exceeds 10 MPa.
[Hint: Think of an observation as a “success” if it
exceeds 10.]

e. Calculate a point estimate of the population
coefficient of variation $\sigma/\mu$, and state which estimator you
used.


2. The National Health and Nutrition Examination
Survey (NHANES) collects demographic, socioeconomic,
dietary, and health-related information on an
annual basis. Here is a sample of 20 observations on
HDL cholesterol level (mg/dl) obtained from the 2009-2010
survey (HDL is “good” cholesterol; the higher its
value, the lower the risk for heart disease):


35 49 52 54 65 51 51 
47 86 36 46 33 39 45 
39 63 95 35 30 48 


a. Calculate a point estimate of the population mean 
HDL cholesterol level. 

b. Making no assumptions about the shape of the popu- 
lation distribution, calculate a point estimate of the 
value that separates the largest 50% of HDL levels 
from the smallest 50%. 


c. Calculate a point estimate of the population standard 
deviation. 

d. An HDL level of at least 60 is considered desirable 
as it corresponds to a significantly lower risk of heart 
disease. Making no assumptions about the shape of 
the population distribution, estimate the proportion p 
of the population having an HDL level of at least 60. 


3. Consider the following sample of observations on
coating thickness for low-viscosity paint (“Achieving a
Target Value for a Manufacturing Process: A Case
Study,” J. of Quality Technology, 1992: 22-26):


.83 .88 .88 1.04 1.09 1.12 1.29 1.31
1.48 1.49 1.59 1.62 1.65 1.71 1.76 1.83


Assume that the distribution of coating thickness is
normal (a normal probability plot strongly supports this
assumption).

a. Calculate a point estimate of the mean value of
coating thickness, and state which estimator you used.

b. Calculate a point estimate of the median of the
coating thickness distribution, and state which estimator
you used.

c. Calculate a point estimate of the value that separates the
largest 10% of all values in the thickness distribution
from the remaining 90%, and state which estimator you
used. [Hint: Express what you are trying to estimate in
terms of $\mu$ and $\sigma$.]

d. Estimate P(X < 1.5), i.e., the proportion of all
thickness values less than 1.5. [Hint: If you knew the
values of $\mu$ and $\sigma$, you could calculate this
probability. These values are not available, but they can be
estimated.]

e. What is the estimated standard error of the estimator
that you used in part (b)?

4. The article from which the data in Exercise 1 was extracted
also gave the accompanying strength observations for
cylinders:

6.1 5.8 7.8 7.1 7.2 9.2 6.6 8.3 7.0 8.3
7.8 8.1 7.4 8.5 8.9 9.8 9.7 14.1 12.6 11.2

Prior to obtaining data, denote the beam strengths by
$X_1, \ldots, X_m$ and the cylinder strengths by $Y_1, \ldots, Y_n$.
Suppose that the $X_i$’s constitute a random sample from
a distribution with mean $\mu_1$ and standard deviation $\sigma_1$
and that the $Y_i$’s form a random sample (independent
of the $X_i$’s) from another distribution with mean $\mu_2$ and
standard deviation $\sigma_2$.

a. Use rules of expected value to show that $\bar{X} - \bar{Y}$ is
an unbiased estimator of $\mu_1 - \mu_2$. Calculate the
estimate for the given data.

b. Use rules of variance from Chapter 5 to obtain an
expression for the variance and standard deviation
(standard error) of the estimator in part (a), and then
compute the estimated standard error.

c. Calculate a point estimate of the ratio $\sigma_1/\sigma_2$ of the
two standard deviations.

d. Suppose a single beam and a single cylinder are
randomly selected. Calculate a point estimate of the
variance of the difference X - Y between beam strength
and cylinder strength.


5. As an example of a situation in which several different
statistics could reasonably be used to calculate a point
estimate, consider a population of N invoices. Associated
with each invoice is its “book value,” the recorded
amount of that invoice. Let T denote the total book value,
a known amount. Some of these book values are erroneous.
An audit will be carried out by randomly selecting n
invoices and determining the audited (correct) value for
each one. Suppose that the sample gives the following
results (in dollars).

Invoice          1    2    3    4    5
Book value     300  720  526  200  127
Audited value  300  520  526  200  157
Error            0  200    0    0  -30

Let

$\bar{Y}$ = sample mean book value
$\bar{X}$ = sample mean audited value
$\bar{D}$ = sample mean error

Propose three different statistics for estimating the
total audited (i.e., correct) value: one involving just
N and $\bar{X}$, another involving T, N, and $\bar{D}$, and the last
involving T and $\bar{X}/\bar{Y}$. If N = 5000 and T = 1,761,300,
calculate the three corresponding point estimates. (The
article “Statistical Models and Analysis in Auditing,”
Statistical Science, 1989: 2-33 discusses properties of
these estimators.)


6. Urinary angiotensinogen (AGT) level is one quantitative
indicator of kidney function. The article “Urinary
Angiotensinogen as a Potential Biomarker of Chronic
Kidney Diseases” (J. of the Amer. Society of
Hypertension, 2008: 349-354) describes a study in
which urinary AGT level (μg) was determined for a
sample of adults with chronic kidney disease. Here is
representative data (consistent with summary quantities
and descriptions in the cited article):

2.6 6.2 7.4 9.6 11.5 13.5 14.5 17.0
20.0 28.8 29.5 29.5 41.7 45.7 56.2 56.2
66.1 66.1 67.6 74.1 97.7 141.3 147.9 177.8
186.2 186.2 190.6 208.9 229.1 229.1 288.4 288.4
346.7 407.4 426.6 575.4 616.6 724.4 812.8 1122.0

An appropriate probability plot supports the use of the
lognormal distribution (see Section 4.5) as a reasonable
model for urinary AGT level (this is what the
investigators did).

a. Estimate the parameters of the distribution. [Hint:
Remember that X has a lognormal distribution with
parameters $\mu$ and $\sigma^2$ if ln(X) is normally distributed
with mean $\mu$ and variance $\sigma^2$.]

b. Use the estimates of part (a) to calculate an estimate
of the expected value of AGT level. [Hint: What is
E(X)?]


7. a. A random sample of 10 houses in a particular area,
each of which is heated with natural gas, is selected
and the amount of gas (therms) used during the
month of January is determined for each house. The
resulting observations are 103, 156, 118, 89, 125,
147, 122, 109, 138, 99. Let $\mu$ denote the average gas
usage during January by all houses in this area.
Compute a point estimate of $\mu$.

b. Suppose there are 10,000 houses in this area that use
natural gas for heating. Let $\tau$ denote the total amount
of gas used by all of these houses during January.
Estimate $\tau$ using the data of part (a). What estimator
did you use in computing your estimate?

c. Use the data in part (a) to estimate p, the proportion
of all houses that used at least 100 therms.

d. Give a point estimate of the population median usage
(the middle value in the population of all houses)
based on the sample of part (a). What estimator did
you use?


8. In a random sample of 80 components of a certain type,
12 are found to be defective.

a. Give a point estimate of the proportion of all such
components that are not defective.

b. A system is to be constructed by randomly selecting
two of these components and connecting them in series.
The series connection implies that the system will
function if and only if neither component is defective (i.e.,
both components work properly). Estimate the proportion
of all such systems that work properly. [Hint: If p
denotes the probability that a component works properly,
how can P(system works) be expressed in terms of p?]


9. Each of 150 newly manufactured items is examined and
the number of scratches per item is recorded (the items
are supposed to be free of scratches), yielding the
following data:


Number of scratches per item   0   1   2   3   4   5   6   7
Observed frequency            18  37  42  30  13   7   2   1


Let X = the number of scratches on a randomly chosen
item, and assume that X has a Poisson distribution with
parameter $\mu$.

a. Find an unbiased estimator of $\mu$ and compute the
estimate for the data. [Hint: $E(X) = \mu$ for X Poisson, so
$E(\bar{X}) = ?$]

b. What is the standard deviation (standard error) of your
estimator? Compute the estimated standard error.
[Hint: $\sigma_X^2 = \mu$ for X Poisson.]


10. Using a long rod that has length $\mu$, you are going to lay
out a square plot in which the length of each side is $\mu$.
Thus the area of the plot will be $\mu^2$. However, you do not
know the value of $\mu$, so you decide to make n independent
measurements $X_1, X_2, \ldots, X_n$ of the length. Assume
that each $X_i$ has mean $\mu$ (unbiased measurements) and
variance $\sigma^2$.

a. Show that $\bar{X}^2$ is not an unbiased estimator for $\mu^2$. [Hint:
For any rv Y, $E(Y^2) = V(Y) + [E(Y)]^2$. Apply this with
$Y = \bar{X}$.]

b. For what value of k is the estimator $\bar{X}^2 - kS^2$ unbiased
for $\mu^2$? [Hint: Compute $E(\bar{X}^2 - kS^2)$.]


11. Of $n_1$ randomly selected male smokers, $X_1$ smoked filter
cigarettes, whereas of $n_2$ randomly selected female
smokers, $X_2$ smoked filter cigarettes. Let $p_1$ and $p_2$ denote
the probabilities that a randomly selected male and
female, respectively, smoke filter cigarettes.

a. Show that $(X_1/n_1) - (X_2/n_2)$ is an unbiased estimator
for $p_1 - p_2$. [Hint: $E(X_i) = n_i p_i$ for i = 1, 2.]

b. What is the standard error of the estimator in part (a)?

c. How would you use the observed values $x_1$ and $x_2$ to
estimate the standard error of your estimator?

d. If $n_1 = n_2 = 200$, $x_1 = 127$, and $x_2 = 176$, use the
estimator of part (a) to obtain an estimate of $p_1 - p_2$.

e. Use the result of part (c) and the data of part (d) to
estimate the standard error of the estimator.


12. Suppose a certain type of fertilizer has an expected yield
per acre of $\mu_1$ with variance $\sigma^2$, whereas the expected
yield for a second type of fertilizer is $\mu_2$ with the same
variance $\sigma^2$. Let $S_1^2$ and $S_2^2$ denote the sample variances of
yields based on sample sizes $n_1$ and $n_2$, respectively, of the
two fertilizers. Show that the pooled (combined) estimator


$$\hat{\sigma}^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$


is an unbiased estimator of $\sigma^2$.


13. Consider a random sample $X_1, \ldots, X_n$ from the pdf


$$f(x; \theta) = .5(1 + \theta x) \qquad -1 \le x \le 1$$


where $-1 \le \theta \le 1$ (this distribution arises in particle
physics). Show that $\hat{\theta} = 3\bar{X}$ is an unbiased estimator of
$\theta$. [Hint: First determine $\mu = E(X) = E(\bar{X})$.]


14. A sample of n captured Pandemonium jet fighters results
in serial numbers $x_1, x_2, x_3, \ldots, x_n$. The CIA knows that
the aircraft were numbered consecutively at the factory
starting with $\alpha$ and ending with $\beta$, so that the total
number of planes manufactured is $\beta - \alpha + 1$ (e.g., if $\alpha = 17$
and $\beta = 29$, then 29 - 17 + 1 = 13 planes having
serial numbers 17, 18, 19, ..., 28, 29 were manufactured).
However, the CIA does not know the values of $\alpha$
or $\beta$. A CIA statistician suggests using the estimator
$\max(X_i) - \min(X_i) + 1$ to estimate the total number of
planes manufactured.

a. If n = 5, $x_1 = 237$, $x_2 = 375$, $x_3 = 202$, $x_4 = 525$, and
$x_5 = 418$, what is the corresponding estimate?

b. Under what conditions on the sample will the value of
the estimate be exactly equal to the true total number of
planes? Will the estimate ever be larger than the true
total? Do you think the estimator is unbiased for
estimating $\beta - \alpha + 1$? Explain in one or two sentences.


15. Let $X_1, X_2, \ldots, X_n$ represent a random sample from a
Rayleigh distribution with pdf


$$f(x; \theta) = \frac{x}{\theta} e^{-x^2/(2\theta)} \qquad x > 0$$


a. It can be shown that $E(X^2) = 2\theta$. Use this fact to
construct an unbiased estimator of $\theta$ based on $\sum X_i^2$ (and
use rules of expected value to show that it is unbiased).

b. Estimate $\theta$ from the following n = 10 observations
on vibratory stress of a turbine blade under
specified conditions:


16.88 10.23 4.59 6.66 13.68
14.23 19.87 9.40 6.51 10.95


16. Suppose the true average growth $\mu$ of one type of plant
during a 1-year period is identical to that of a second
type, but the variance of growth for the first type is $\sigma^2$,
whereas for the second type the variance is $4\sigma^2$. Let
$X_1, \ldots, X_m$ be m independent growth observations on the
first type [so $E(X_i) = \mu$, $V(X_i) = \sigma^2$], and let $Y_1, \ldots, Y_n$
be n independent growth observations on the second
type [$E(Y_i) = \mu$, $V(Y_i) = 4\sigma^2$].

a. Show that the estimator $\hat{\mu} = \delta\bar{X} + (1 - \delta)\bar{Y}$ is unbiased
for $\mu$ (for $0 < \delta < 1$, the estimator is a weighted
average of the two individual sample means).

b. For fixed m and n, compute $V(\hat{\mu})$, and then find the
value of $\delta$ that minimizes $V(\hat{\mu})$. [Hint: Differentiate
$V(\hat{\mu})$ with respect to $\delta$.]


17. In Chapter 3, we defined a negative binomial rv as the
number of failures that occur before the rth success in a
sequence of independent and identical success/failure
trials. The probability mass function (pmf) of X is


$$nb(x; r, p) = \binom{x + r - 1}{x} p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$


a. Suppose that $r \ge 2$. Show that

$$\hat{p} = (r - 1)/(X + r - 1)$$

is an unbiased estimator for p. [Hint: Write out $E(\hat{p})$
and cancel x + r - 1 inside the sum.]

b. A reporter wishing to interview five individuals who
support a certain candidate begins asking people
whether (S) or not (F) they support the candidate. If
the sequence of responses is SFFSFFFSSS, estimate
p = the true proportion who support the candidate.


18. Let $X_1, X_2, \ldots, X_n$ be a random sample from a pdf f(x)
that is symmetric about $\mu$, so that $\tilde{X}$ is an unbiased
estimator of $\mu$. If n is large, it can be shown that
$V(\tilde{X}) \approx 1/(4n[f(\mu)]^2)$.

a. Compare $V(\tilde{X})$ to $V(\bar{X})$ when the underlying
distribution is normal.

b. When the underlying pdf is Cauchy (see Example
6.7), $V(\bar{X}) = \infty$, so $\bar{X}$ is a terrible estimator. What is
$V(\tilde{X})$ in this case when n is large?


19. An investigator wishes to estimate the proportion of
students at a certain university who have violated the honor
code. Having obtained a random sample of n students, she
realizes that asking each, “Have you violated the honor
code?” will probably result in some untruthful responses.
Consider the following scheme, called a randomized
response technique. The investigator makes up a deck of
100 cards, of which 50 are of type I and 50 are of type II.

Type I: Have you violated the honor code (yes or no)?
Type II: Is the last digit of your telephone number a 0,
1, or 2 (yes or no)?

Each student in the random sample is asked to mix
the deck, draw a card, and answer the resulting question
truthfully. Because of the irrelevant question on type II
cards, a yes response no longer stigmatizes the respondent,
so we assume that responses are truthful. Let p denote
the proportion of honor-code violators (i.e., the
probability of a randomly selected student being a violator),
and let $\lambda$ = P(yes response). Then $\lambda$ and p are related by
$\lambda = .5p + (.5)(.3)$.

a. Let Y denote the number of yes responses, so
$Y \sim \text{Bin}(n, \lambda)$. Thus Y/n is an unbiased estimator of $\lambda$. Derive
an estimator for p based on Y. If n = 80 and y = 20,
what is your estimate? [Hint: Solve $\lambda = .5p + .15$ for
p and then substitute Y/n for $\lambda$.]

b. Use the fact that $E(Y/n) = \lambda$ to show that your
estimator $\hat{p}$ is unbiased.

c. If there were 70 type I and 30 type II cards, what
would be your estimator for p?


6.2 Methods of Point Estimation 


We now introduce two “constructive” methods for obtaining point estimators: the 
method of moments and the method of maximum likelihood. By constructive we 
mean that the general definition of each type of estimator suggests explicitly how 
to obtain the estimator in any specific problem. Although maximum likelihood esti- 
mators are generally preferable to moment estimators because of certain efficiency 
properties, they often require significantly more computation than do moment esti- 
mators. It is sometimes the case that these methods yield unbiased estimators. 


The Method of Moments 


The basic idea of this method is to equate certain sample characteristics, such as the 
mean, to the corresponding population expected values. Then solving these equa- 
tions for unknown parameter values yields the estimators. 


DEFINITION Let $X_1, \ldots, X_n$ be a random sample from a pmf or pdf f(x). For k = 1, 2,
3, ..., the kth population moment, or kth moment of the distribution f(x),
is $E(X^k)$. The kth sample moment is $(1/n)\sum_{i=1}^{n} X_i^k$.

Thus the first population moment is $E(X) = \mu$, and the first sample moment
is $\sum X_i/n = \bar{X}$. The second population and sample moments are $E(X^2)$ and
$\sum X_i^2/n$, respectively. The population moments will be functions of any unknown
parameters $\theta_1, \theta_2, \ldots$.


DEFINITION Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with pmf or pdf
$f(x; \theta_1, \ldots, \theta_m)$, where $\theta_1, \ldots, \theta_m$ are parameters whose values are unknown. Then
the moment estimators $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are obtained by equating the first m sample
moments to the corresponding first m population moments and solving for
$\theta_1, \ldots, \theta_m$.


If, for example, m = 2, $E(X)$ and $E(X^2)$ will be functions of $\theta_1$ and $\theta_2$. Setting
$E(X) = (1/n)\sum X_i$ ($= \bar{X}$) and $E(X^2) = (1/n)\sum X_i^2$ gives two equations in $\theta_1$ and $\theta_2$.
The solution then defines the estimators.


EXAMPLE 6.12 Let $X_1, X_2, \ldots, X_n$ represent a random sample of service times of n customers at
a certain facility, where the underlying distribution is assumed exponential with
parameter $\lambda$. Since there is only one parameter to be estimated, the estimator is
obtained by equating $E(X)$ to $\bar{X}$. Since $E(X) = 1/\lambda$ for an exponential distribution,
this gives $1/\lambda = \bar{X}$ or $\lambda = 1/\bar{X}$. The moment estimator of $\lambda$ is then $\hat{\lambda} = 1/\bar{X}$. ■


EXAMPLE 6.13 Let $X_1, \ldots, X_n$ be a random sample from a gamma distribution with parameters $\alpha$ and
$\beta$. From Section 4.4, $E(X) = \alpha\beta$ and $E(X^2) = \beta^2\Gamma(\alpha + 2)/\Gamma(\alpha) = \beta^2(\alpha + 1)\alpha$.
The moment estimators of $\alpha$ and $\beta$ are obtained by solving


$$\bar{X} = \alpha\beta \qquad \frac{1}{n}\sum X_i^2 = \alpha(\alpha + 1)\beta^2$$


Since $\alpha(\alpha + 1)\beta^2 = \alpha^2\beta^2 + \alpha\beta^2$ and the first equation implies $\alpha^2\beta^2 = \bar{X}^2$, the
second equation becomes


$$\frac{1}{n}\sum X_i^2 = \bar{X}^2 + \alpha\beta^2$$


Now dividing each side of this second equation by the corresponding side of the first
equation and substituting back gives the estimators


$$\hat{\alpha} = \frac{\bar{X}^2}{(1/n)\sum X_i^2 - \bar{X}^2} \qquad \hat{\beta} = \frac{(1/n)\sum X_i^2 - \bar{X}^2}{\bar{X}}$$

To illustrate, the survival-time data mentioned in Example 4.24 is


152 115 109 94 88 137 152 77 160 165
125 40 128 123 136 101 62 153 83 69


from which $\bar{x} = 113.5$ and $(1/20)\sum x_i^2 = 14,087.8$. The parameter estimates are


$$\hat{\alpha} = \frac{(113.5)^2}{14,087.8 - (113.5)^2} = 10.7 \qquad \hat{\beta} = \frac{14,087.8 - (113.5)^2}{113.5} = 10.6$$


These estimates of $\alpha$ and $\beta$ differ from the values suggested by Gross and Clark
because they used a different estimation technique. ■
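
As a rough check on the gamma moment estimators, the sketch below applies the two formulas directly to the 20 survival times. The function name `gamma_moment_estimates` is a hypothetical helper. Working from the unrounded sample mean (113.45 rather than 113.5) gives estimates of about 10.58 and 10.73, slightly different from the values obtained above from rounded summary quantities.

```python
# Survival times from Example 6.13
survival = [152, 115, 109, 94, 88, 137, 152, 77, 160, 165,
            125, 40, 128, 123, 136, 101, 62, 153, 83, 69]

def gamma_moment_estimates(xs):
    """Method-of-moments estimates (alpha-hat, beta-hat) for a gamma sample."""
    n = len(xs)
    xbar = sum(xs) / n                 # first sample moment
    m2 = sum(x * x for x in xs) / n    # second sample moment
    spread = m2 - xbar ** 2            # (1/n) * sum(x_i^2) - xbar^2
    return xbar ** 2 / spread, spread / xbar

alpha_hat, beta_hat = gamma_moment_estimates(survival)
print(f"alpha-hat = {alpha_hat:.3f}, beta-hat = {beta_hat:.3f}")
```

By construction the product of the two estimates equals the sample mean, which mirrors the first moment equation.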
EXAMPLE 6.14 Let $X_1, \ldots, X_n$ be a random sample from a generalized negative binomial
distribution with parameters r and p (see Section 3.5). Since $E(X) = r(1 - p)/p$
and $V(X) = r(1 - p)/p^2$, $E(X^2) = V(X) + [E(X)]^2 = r(1 - p)(r - rp + 1)/p^2$.
Equating $E(X)$ to $\bar{X}$ and $E(X^2)$ to $(1/n)\sum X_i^2$ eventually gives


$$\hat{p} = \frac{\bar{X}}{(1/n)\sum X_i^2 - \bar{X}^2} \qquad \hat{r} = \frac{\bar{X}^2}{(1/n)\sum X_i^2 - \bar{X}^2 - \bar{X}}$$

As an illustration, Reep, Pollard, and Benjamin (“Skill and Chance in Ball
Games,” J. of Royal Stat. Soc., 1971: 623-629) consider the negative binomial
distribution as a model for the number of goals per game scored by National Hockey
League teams. The data for 1966-1967 follows (420 games):


Goals     |  0   1   2   3   4   5   6   7   8   9  10
Frequency | 29  71  82  89  65  45  24   7   4   1   3


Then,

$$\bar{x} = \sum x_i/420 = [(0)(29) + (1)(71) + \cdots + (10)(3)]/420 = 2.98$$

and

$$\sum x_i^2/420 = [(0)^2(29) + (1)^2(71) + \cdots + (10)^2(3)]/420 = 12.40$$

Thus,


$$\hat{p} = \frac{2.98}{12.40 - (2.98)^2} = .85 \qquad \hat{r} = \frac{(2.98)^2}{12.40 - (2.98)^2 - 2.98} = 16.5$$


Although r by definition must be positive, the denominator of $\hat{r}$ could be negative,
indicating that the negative binomial distribution is not appropriate (or that the
moment estimator is flawed). ■
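
The hockey calculation can be reproduced from the frequency table, as in the sketch below (the function name `negbin_moment_estimates` is a hypothetical helper). With unrounded moments the estimates come out near .84 and 16.1 rather than .85 and 16.5. The denominator of the estimator of r is a small difference of nearly equal quantities, so that estimate is quite sensitive to rounding.

```python
goals = list(range(11))
# Frequencies for 0..10 goals; the count of 65 for 4 goals is the value
# implied by the stated total of 420 games.
freq = [29, 71, 82, 89, 65, 45, 24, 7, 4, 1, 3]

def negbin_moment_estimates(values, counts):
    """Moment estimates (p-hat, r-hat) for a negative binomial from a frequency table."""
    n = sum(counts)
    xbar = sum(v * c for v, c in zip(values, counts)) / n
    m2 = sum(v * v * c for v, c in zip(values, counts)) / n
    spread = m2 - xbar ** 2          # (1/n) * sum(x_i^2) - xbar^2
    denom = spread - xbar            # must be positive for r-hat to make sense
    if denom <= 0:
        raise ValueError("moment estimate of r is not positive for these data")
    return xbar / spread, xbar ** 2 / denom

p_hat, r_hat = negbin_moment_estimates(goals, freq)
print(f"p-hat = {p_hat:.4f}, r-hat = {r_hat:.2f}")
```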


Maximum Likelihood Estimation 


The method of maximum likelihood was first introduced by R. A. Fisher, a geneticist 
and statistician, in the 1920s. Most statisticians recommend this method, at least 
when the sample size is large, since the resulting estimators have certain desirable 
efficiency properties (see the proposition on page 271). 


EXAMPLE 6.15 The best protection against hacking into an online account is to use a password that 
has at least 8 characters consisting of upper- and lowercase letters, numerals, and spe- 
cial characters. [Note: The Jan. 2012 issue of Consumer Reports reported that only 
25% of individuals surveyed used a strong password.] Suppose that 10 individuals 
who have email accounts with a certain provider are selected, and it is found that the 
first, third, and tenth individuals have such strong protection, whereas the others do 
not. Let p = P(strong protection), i.e., p is the proportion of all such account holders 
having strong protection. Define (Bernoulli) random variables $X_1, X_2, \ldots, X_{10}$ by


$$X_1 = \begin{cases} 1 & \text{if 1st has strong protection} \\ 0 & \text{if 1st does not have strong protection} \end{cases} \quad \ldots \quad X_{10} = \begin{cases} 1 & \text{if 10th has strong protection} \\ 0 & \text{if 10th does not have strong protection} \end{cases}$$


Then for the obtained sample, $X_1 = X_3 = X_{10} = 1$ and the other seven $X_i$’s are all
zero. The probability mass function of any particular $X_i$ is $p^{x_i}(1 - p)^{1 - x_i}$, which
becomes p if $x_i = 1$ and 1 - p when $x_i = 0$. Now suppose that the conditions of
various passwords are independent of one another. This implies that the $X_i$’s are
independent, so their joint probability mass function is the product of the individual
pmf’s. Thus the joint pmf evaluated at the observed $X_i$’s is


$$f(x_1, \ldots, x_{10}; p) = p(1 - p)p \cdots p = p^3(1 - p)^7 \qquad (6.4)$$

Suppose that p = .25. Then the probability of observing the sample that we actually
obtained is $(.25)^3(.75)^7 = .002086$. If instead p = .50, then this probability is
$(.50)^3(.50)^7 = .000977$. For what value of p is the obtained sample most likely to
have occurred? That is, for what value of p is the joint pmf (6.4) as large as it can
be? What value of p maximizes (6.4)? Figure 6.6(a) shows a graph of the likelihood
(6.4) as a function of p. It appears that the graph reaches its peak above p = .3 = the
proportion of individuals in the sample with strong protection. Figure 6.6(b) shows a graph of the
natural logarithm of (6.4); since ln[g(u)] is a strictly increasing function of g(u),
finding u to maximize the function g(u) is the same as finding u to maximize ln[g(u)].


[Figure 6.6: two panels, likelihood versus p and ln(likelihood) versus p, each for p from 0.0 to 1.0]

Figure 6.6 (a) Graph of the likelihood (joint pmf) (6.4) from Example 6.15 (b) Graph of the
natural logarithm of the likelihood


We can verify our visual impression by using calculus to find the value of p that
maximizes (6.4). Working with the natural log of the joint pmf is often easier than
working with the joint pmf itself, since the joint pmf is typically a product so its
logarithm will be a sum. Here


$$\ln[f(x_1, \ldots, x_{10}; p)] = \ln[p^3(1 - p)^7] = 3\ln(p) + 7\ln(1 - p) \qquad (6.5)$$

Thus

$$\frac{d}{dp}\{\ln[f(x_1, \ldots, x_{10}; p)]\} = \frac{d}{dp}\{3\ln(p) + 7\ln(1 - p)\} = \frac{3}{p} + \frac{7}{1 - p}(-1) = \frac{3}{p} - \frac{7}{1 - p}$$


[the (-1) comes from the chain rule in calculus]. Equating this derivative to 0 and
solving for p gives 3(1 - p) = 7p, from which 3 = 10p and so p = 3/10 = .30 as
conjectured. That is, our point estimate is $\hat{p} = .30$. It is called the maximum
likelihood estimate because it is the parameter value that maximizes the likelihood
(joint pmf) of the observed sample. In general, the second derivative should be
examined to make sure a maximum has been obtained, but here this is obvious
from Figure 6.6.

Suppose that rather than being told the condition of every password,
we had only been informed that three of the ten were strong. Then we would
have the observed value of a binomial random variable X = the number with
strong passwords. The pmf of X is $\binom{10}{x} p^x (1 - p)^{10 - x}$. For x = 3, this becomes
$\binom{10}{3} p^3 (1 - p)^7$. The binomial coefficient $\binom{10}{3}$ is irrelevant to the maximization, so
again $\hat{p} = .30$. ■
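
The maximization in Example 6.15 can also be checked numerically. The sketch below evaluates the log likelihood 3 ln(p) + 7 ln(1 - p) on a grid of p values; the grid spacing of .001 and the function name `log_likelihood` are arbitrary illustrative choices.

```python
import math

def log_likelihood(p, successes=3, n=10):
    """ln of the joint pmf p^3 (1 - p)^7 for the observed password sample."""
    return successes * math.log(p) + (n - successes) * math.log(1 - p)

# Evaluate on a grid strictly inside (0, 1), where both logarithms are defined
grid = [i / 1000 for i in range(1, 1000)]
p_best = max(grid, key=log_likelihood)

like_25 = math.exp(log_likelihood(0.25))   # likelihood at p = .25
like_50 = math.exp(log_likelihood(0.50))   # likelihood at p = .50
print(p_best, round(like_25, 6), round(like_50, 6))
```

The grid maximizer agrees with the calculus solution of .30, and the two likelihood values match those quoted for p = .25 and p = .50.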


DEFINITION Let $X_1, X_2, \ldots, X_n$ have joint pmf or pdf

$$f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m) \qquad (6.6)$$


where the parameters $\theta_1, \ldots, \theta_m$ have unknown values. When $x_1, \ldots, x_n$ are the
observed sample values and (6.6) is regarded as a function of $\theta_1, \ldots, \theta_m$, it is called
the likelihood function. The maximum likelihood estimates (mle’s) $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are
those values of the $\theta_i$’s that maximize the likelihood function, so that


$$f(x_1, \ldots, x_n; \hat{\theta}_1, \ldots, \hat{\theta}_m) \ge f(x_1, \ldots, x_n; \theta_1, \ldots, \theta_m) \quad \text{for all } \theta_1, \ldots, \theta_m$$


When the $X_i$’s are substituted in place of the $x_i$’s, the maximum likelihood
estimators result.


The likelihood function tells us how likely the observed sample is as a function 
of the possible parameter values. Maximizing the likelihood gives the parameter val- 
ues for which the observed sample is most likely to have been generated—that is, the 
parameter values that “agree most closely” with the observed data. 


EXAMPLE 6.16 Suppose $X_1, X_2, \ldots, X_n$ is a random sample from an exponential distribution with
parameter $\lambda$. Because of independence, the likelihood function is a product of the
individual pdf’s:

$$f(x_1, \ldots, x_n; \lambda) = (\lambda e^{-\lambda x_1}) \cdots (\lambda e^{-\lambda x_n}) = \lambda^n e^{-\lambda \sum x_i}$$

The natural logarithm of the likelihood function is

$$\ln[f(x_1, \ldots, x_n; \lambda)] = n \ln(\lambda) - \lambda \sum x_i$$


Equating $(d/d\lambda)[\ln(\text{likelihood})]$ to zero results in $n/\lambda - \sum x_i = 0$, or $\lambda = n/\sum x_i = 1/\bar{x}$.
Thus the mle is $\hat{\lambda} = 1/\bar{X}$; it is identical to the method of moments estimator [but it
is not an unbiased estimator, since $E(1/\bar{X}) \ne 1/E(\bar{X})$]. ■


EXAMPLE 6.17 Let $X_1, \ldots, X_n$ be a random sample from a normal distribution. The likelihood
function is


$$f(x_1, \ldots, x_n; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x_1 - \mu)^2/(2\sigma^2)} \cdots \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x_n - \mu)^2/(2\sigma^2)} = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} e^{-\sum (x_i - \mu)^2/(2\sigma^2)}$$

so

$$\ln[f(x_1, \ldots, x_n; \mu, \sigma^2)] = -\frac{n}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum (x_i - \mu)^2$$

To find the maximizing values of $\mu$ and $\sigma^2$, we must take the partial derivatives of
ln(f) with respect to $\mu$ and $\sigma^2$, equate them to zero, and solve the resulting two
equations. Omitting the details, the resulting mle’s are


$$\hat{\mu} = \bar{X} \qquad \hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{n}$$

The mle of $\sigma^2$ is not the unbiased estimator, so two different principles of estimation
(unbiasedness and maximum likelihood) yield two different estimators. ■
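
The omitted details can be filled in along the following lines. This is the standard calculus argument, written here as a sketch in the notation of the example rather than as a derivation quoted from the text; $\sigma^2$ is treated as a single variable.

```latex
\begin{align*}
\frac{\partial}{\partial \mu} \ln(f)
  &= \frac{1}{\sigma^2} \sum (x_i - \mu) = 0
  &&\Longrightarrow\quad \mu = \frac{1}{n}\sum x_i = \bar{x} \\[4pt]
\frac{\partial}{\partial \sigma^2} \ln(f)
  &= -\frac{n}{2\sigma^2} + \frac{1}{2(\sigma^2)^2} \sum (x_i - \mu)^2 = 0
  &&\Longrightarrow\quad \sigma^2 = \frac{1}{n}\sum (x_i - \mu)^2
\end{align*}
```

Substituting $\mu = \bar{x}$ into the second solution gives $\sigma^2 = \sum (x_i - \bar{x})^2/n$, and replacing the $x_i$’s by the $X_i$’s yields the estimators stated in the example.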


EXAMPLE 6.18 In Chapter 3, we mentioned the use of the Poisson distribution for modeling the
number of “events” that occur in a two-dimensional region. Assume that when the
region R being sampled has area a(R), the number X of events occurring in R has a
Poisson distribution with parameter $\lambda a(R)$ (where $\lambda$ is the expected number of events
per unit area) and that nonoverlapping regions yield independent X’s.
Suppose an ecologist selects n nonoverlapping regions $R_1, \ldots, R_n$ and counts
the number of plants of a certain species found in each region. The joint pmf
(likelihood) is then


$$p(x_1, \ldots, x_n; \lambda) = \frac{[\lambda \cdot a(R_1)]^{x_1} e^{-\lambda \cdot a(R_1)}}{x_1!} \cdots \frac{[\lambda \cdot a(R_n)]^{x_n} e^{-\lambda \cdot a(R_n)}}{x_n!} = \frac{[a(R_1)]^{x_1} \cdots [a(R_n)]^{x_n} \cdot \lambda^{\sum x_i} \cdot e^{-\lambda \sum a(R_i)}}{x_1! \cdots x_n!}$$


The log likelihood is

$$\ln[p(x_1, \ldots, x_n; \lambda)] = \sum x_i \cdot \ln[a(R_i)] + \ln(\lambda) \cdot \sum x_i - \lambda \sum a(R_i) - \sum \ln(x_i!)$$

Taking $d/d\lambda\,[\ln(p)]$ and equating it to zero yields


$$\frac{\sum x_i}{\lambda} - \sum a(R_i) = 0$$


from which


$$\lambda = \frac{\sum x_i}{\sum a(R_i)}$$


The mle is then $\hat{\lambda} = \sum X_i / \sum a(R_i)$. This is intuitively reasonable because $\lambda$ is the true
density (plants per unit area), whereas $\hat{\lambda}$ is the sample density since $\sum a(R_i)$ is just the
total area sampled. Because $E(X_i) = \lambda \cdot a(R_i)$, the estimator is unbiased.
Sometimes an alternative sampling procedure is used. Instead of fixing
regions to be sampled, the ecologist will select n points in the entire region of
interest and let $y_i$ = the distance from the ith point to the nearest plant. The cumulative
distribution function (cdf) of Y = distance to the nearest plant is

$$F_Y(y) = P(Y \le y) = 1 - P(Y > y) = 1 - P(\text{no plants in a circle of radius } y) = 1 - \frac{e^{-\lambda \pi y^2}(\lambda \pi y^2)^0}{0!} = 1 - e^{-\lambda \pi y^2}$$


Taking the derivative of $F_Y(y)$ with respect to y yields


$$f_Y(y; \lambda) = \begin{cases} 2\pi\lambda y e^{-\lambda \pi y^2} & y \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

If we now form the likelihood $f_Y(y_1; \lambda) \cdots f_Y(y_n; \lambda)$, differentiate ln(likelihood),
and so on, the resulting mle is


$$\hat{\lambda} = \frac{n}{\pi \sum Y_i^2} = \frac{\text{number of plants observed}}{\text{total area sampled}}$$


which is also a sample density. It can be shown that in a sparse environment (small $\lambda$),
the distance method is in a certain sense better, whereas in a dense environment the
first sampling method is better. ■
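
For the distance method, the steps summarized as “differentiate ln(likelihood), and so on” can be written out as follows. This is a sketch of the routine calculation using the pdf of Y given in the example.

```latex
\begin{align*}
\ln[\text{likelihood}]
  &= \sum_{i=1}^{n} \ln\!\left(2\pi\lambda\, y_i\, e^{-\lambda \pi y_i^2}\right)
   = n\ln(2\pi) + n\ln(\lambda) + \sum \ln(y_i) - \lambda \pi \sum y_i^2 \\[4pt]
\frac{d}{d\lambda} \ln[\text{likelihood}]
  &= \frac{n}{\lambda} - \pi \sum y_i^2 = 0
  \quad\Longrightarrow\quad
  \lambda = \frac{n}{\pi \sum y_i^2}
\end{align*}
```

Each term $\pi y_i^2$ is the area of the circle searched around the ith point, which is why the denominator can be read as the total area sampled.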


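EXAMPLE 6.19 Let $X_1, \ldots, X_n$ be a random sample from a Weibull pdf


$$f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}} \cdot x^{\alpha - 1} \cdot e^{-(x/\beta)^{\alpha}} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$


Writing the likelihood and ln(likelihood), then setting both $(\partial/\partial\alpha)[\ln(f)] = 0$ and
$(\partial/\partial\beta)[\ln(f)] = 0$, yields the equations


$$\alpha = \left[\frac{\sum x_i^{\alpha} \cdot \ln(x_i)}{\sum x_i^{\alpha}} - \frac{\sum \ln(x_i)}{n}\right]^{-1} \qquad \beta = \left(\frac{\sum x_i^{\alpha}}{n}\right)^{1/\alpha}$$


These two equations cannot be solved explicitly to give general formulas for
the mle’s $\hat{\alpha}$ and $\hat{\beta}$. Instead, for each sample $x_1, \ldots, x_n$, the equations must be solved
using an iterative numerical procedure. The R, SAS and Minitab software packages
can be used for this purpose. Even moment estimators of $\alpha$ and $\beta$ are somewhat
complicated (see Exercise 21). ■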
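
One simple iterative procedure is bisection on the first equation, which involves $\alpha$ alone; the second equation then gives $\beta$ directly. The sketch below is an illustration under stated assumptions, not the algorithm used by any of the packages named above: the function name `weibull_mle`, the search interval, and the tolerance are all hypothetical choices, and the data are simulated because no Weibull sample is listed in this example.

```python
import math
import random

def weibull_mle(xs, lo=1e-3, hi=100.0, tol=1e-10):
    """Solve the two Weibull likelihood equations: bisection for alpha, then beta."""
    logs = [math.log(x) for x in xs]
    n = len(xs)
    mean_log = sum(logs) / n
    max_log = max(logs)

    def g(alpha):
        # sum(x^a * ln x) / sum(x^a) - 1/a - mean(ln x); zero at the mle of alpha.
        # Weights are rescaled by the largest observation to avoid overflow.
        w = [math.exp(alpha * (l - max_log)) for l in logs]
        weighted = sum(wi * li for wi, li in zip(w, logs)) / sum(w)
        return weighted - 1.0 / alpha - mean_log

    # g is increasing in alpha, so bisection applies once the root is bracketed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    alpha = 0.5 * (lo + hi)
    # beta = (sum(x^alpha) / n)^(1/alpha), computed on the log scale
    w_sum = sum(math.exp(alpha * (l - max_log)) for l in logs)
    beta = math.exp(max_log + math.log(w_sum / n) / alpha)
    return alpha, beta

# Simulated data: random.weibullvariate takes (scale, shape) in that order
rng = random.Random(0)
sample = [rng.weibullvariate(77.0, 12.0) for _ in range(2000)]
alpha_hat, beta_hat = weibull_mle(sample)
print(f"alpha-hat = {alpha_hat:.3f}, beta-hat = {beta_hat:.3f}")
```

With a sample this large the estimates should land close to the shape 12 and scale 77 used to generate the data.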


Estimating Functions of Parameters 


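Once the mle for a parameter $\theta$ is available, the mle for any function of $\theta$, such as
$1/\theta$ or $\sqrt{\theta}$, is easily obtained.


PROPOSITION The Invariance Principle
Let $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m$ be the mle’s of the parameters $\theta_1, \theta_2, \ldots, \theta_m$. Then the mle of
any function $h(\theta_1, \theta_2, \ldots, \theta_m)$ of these parameters is the function $h(\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m)$
of the mle’s.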


EXAMPLE 6.20 (Example 6.17 continued) In the normal case, the mle's of $\mu$ and $\sigma^2$ are
$\hat{\mu} = \bar{X}$ and $\hat{\sigma}^2 = \sum (X_i - \bar{X})^2/n$. To obtain the mle of the function
$h(\mu, \sigma^2) = \sqrt{\sigma^2} = \sigma$, substitute the mle's into the function:

$$\hat{\sigma} = \sqrt{\hat{\sigma}^2} = \left[ \frac{1}{n} \sum (X_i - \bar{X})^2 \right]^{1/2}$$

The mle of $\sigma$ is not the sample standard deviation $S$, though they are close unless $n$
is quite small. ■
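
The difference between the mle of $\sigma$ (divisor $n$) and the sample standard deviation $S$ (divisor $n - 1$) can be seen numerically. The short Python sketch below uses the standard library's `statistics.pstdev` and `statistics.stdev`, which use those two divisors respectively; the data values are hypothetical.

```python
from statistics import pstdev, stdev

# Hypothetical observations (illustration only)
data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma_mle = pstdev(data)  # sqrt( sum (x_i - xbar)^2 / n ): the mle of sigma
s = stdev(data)           # sqrt( sum (x_i - xbar)^2 / (n - 1) ): sample sd S

print(sigma_mle, s)       # the mle is always slightly smaller than S
```

The two values differ by the factor $\sqrt{n/(n-1)}$, which approaches 1 as $n$ grows.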


EXAMPLE 6.21 (Example 6.19 continued) The mean value of an rv $X$ that has a Weibull distribution is

$$\mu = \beta \cdot \Gamma(1 + 1/\alpha)$$


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 




The mle of $\mu$ is therefore $\hat{\mu} = \hat{\beta}\,\Gamma(1 + 1/\hat{\alpha})$, where $\hat{\alpha}$ and $\hat{\beta}$ are the mle's of $\alpha$ and
$\beta$. In particular, $\bar{X}$ is not the mle of $\mu$, though it is an unbiased estimator. At least
for large $n$, $\hat{\mu}$ is a better estimator than $\bar{X}$.

For the data given in Example 6.3, the mle's of the Weibull parameters are
$\hat{\alpha} = 11.9731$ and $\hat{\beta} = 77.0153$, from which $\hat{\mu} = 73.80$. This estimate is quite close
to the sample mean 73.88. ■
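
The invariance calculation above can be checked directly. The Python sketch below plugs the reported mle's $\hat{\alpha} = 11.9731$ and $\hat{\beta} = 77.0153$ into $\beta \cdot \Gamma(1 + 1/\alpha)$ using `math.gamma`; the function name is hypothetical.

```python
import math

def weibull_mean(alpha, beta):
    """Mean of a Weibull distribution: beta * Gamma(1 + 1/alpha)."""
    return beta * math.gamma(1.0 + 1.0 / alpha)

# mle's of the Weibull parameters reported in the example
alpha_hat, beta_hat = 11.9731, 77.0153

# By the invariance principle, the mle of the mean is the mean evaluated at the mle's
mu_hat = weibull_mean(alpha_hat, beta_hat)
print(round(mu_hat, 2))  # about 73.80
```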


Large Sample Behavior of the MLE 


Although the principle of maximum likelihood estimation has considerable intuitive 
appeal, the following proposition provides additional rationale for the use of mle’s. 


PROPOSITION Under very general conditions on the joint distribution of the sample, when the
sample size $n$ is large, the maximum likelihood estimator of any parameter $\theta$ is
at least approximately unbiased $[E(\hat{\theta}) \approx \theta]$ and has variance that is either as
small as or nearly as small as can be achieved by any estimator. Stated another
way, the mle $\hat{\theta}$ is either exactly or at least approximately the MVUE of $\theta$.


Because of this result and the fact that calculus-based techniques can usually be used to 
derive the mle’s (though often numerical methods, such as Newton’s method, are nec- 
essary), maximum likelihood estimation is the most widely used estimation technique 
among statisticians. Many of the estimators used in the remainder of the book are mle’s. 
Obtaining an mle, however, does require that the underlying distribution be specified. 


Some Complications 


Sometimes calculus cannot be used to obtain mle’s. 


EXAMPLE 6.22 Suppose waiting time for a bus is uniformly distributed on $[0, \theta]$ and the results
$x_1, \ldots, x_n$ of a random sample from this distribution have been observed. Since
$f(x; \theta) = 1/\theta$ for $0 \le x \le \theta$ and 0 otherwise,

$$f(x_1, \ldots, x_n; \theta) = \begin{cases} \dfrac{1}{\theta^n} & 0 \le x_1 \le \theta, \ldots, 0 \le x_n \le \theta \\ 0 & \text{otherwise} \end{cases}$$

As long as $\max(x_i) \le \theta$, the likelihood is $1/\theta^n$, which is positive, but as soon as
$\theta < \max(x_i)$, the likelihood drops to 0. This is illustrated in Figure 6.7. Calculus will
not work because the maximum of the likelihood occurs at a point of discontinuity,
but the figure shows that $\hat{\theta} = \max(X_i)$. Thus if my waiting times are 2.3, 3.7, 1.5, .4,
and 3.2, then the mle is $\hat{\theta} = 3.7$. From Example 6.4, the mle is not unbiased.
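
The shape of this likelihood can be explored numerically. The Python sketch below evaluates the likelihood at a few values of $\theta$ for the five waiting times in the example; the function name and the trial values of $\theta$ are assumptions made for illustration.

```python
def uniform_likelihood(theta, xs):
    """Likelihood of a Uniform[0, theta] sample: theta**(-n) if all x_i lie in [0, theta], else 0."""
    if theta <= 0 or min(xs) < 0 or max(xs) > theta:
        return 0.0
    return theta ** (-len(xs))

waits = [2.3, 3.7, 1.5, 0.4, 3.2]  # waiting times from the example
theta_hat = max(waits)             # the mle is the largest observation

for theta in (3.6, 3.7, 4.0, 5.0):
    print(theta, uniform_likelihood(theta, waits))
```

The likelihood is 0 just below the largest observation, jumps to its maximum at $\theta = \max(x_i)$, and then decreases as $\theta$ grows.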


Figure 6.7 The likelihood function for Example 6.22 (vertical axis: likelihood; horizontal axis: $\theta$, with $\max(x_i)$ marked) ■


EXAMPLE 6.23 A method that is often used to estimate the size of a wildlife population involves
performing a capture/recapture experiment. In this experiment, an initial sample of $M$
animals is captured, each of these animals is tagged, and the animals are then
returned to the population. After allowing enough time for the tagged individuals to
mix into the population, another sample of size $n$ is captured. With $X$ = the number
of tagged animals in the second sample, the objective is to use the observed $x$ to
estimate the population size $N$.

The parameter of interest is $\theta = N$, which can assume only integer values, so
even after determining the likelihood function (pmf of $X$ here), using calculus to
obtain $\hat{N}$ would present difficulties. If we think of a success as a previously tagged
animal being recaptured, then sampling is without replacement from a population
containing $M$ successes and $N - M$ failures, so that $X$ is a hypergeometric rv and the
likelihood function is

$$p(x; N) = h(x; n, M, N) = \frac{\dbinom{M}{x} \cdot \dbinom{N - M}{n - x}}{\dbinom{N}{n}}$$

The integer-valued nature of $N$ notwithstanding, it would be difficult to take
the derivative of $p(x; N)$. However, if we consider the ratio of $p(x; N)$ to $p(x; N - 1)$,
we have

$$\frac{p(x; N)}{p(x; N - 1)} = \frac{(N - M) \cdot (N - n)}{N(N - M - n + x)}$$

This ratio is larger than 1 if and only if (iff) $N < Mn/x$. The value of $N$ for which
$p(x; N)$ is maximized is therefore the largest integer less than $Mn/x$. If we use standard
mathematical notation $[r]$ for the largest integer less than or equal to $r$, the mle
of $N$ is $\hat{N} = [Mn/x]$. As an illustration, if $M = 200$ fish are taken from a lake and
tagged, and subsequently $n = 100$ fish are recaptured, and among the 100 there are
$x = 11$ tagged fish, then $\hat{N} = [(200)(100)/11] = [1818.18] = 1818$. The estimate is
actually rather intuitive; $x/n$ is the proportion of the recaptured sample that is tagged,
whereas $M/N$ is the proportion of the entire population that is tagged. The estimate is
obtained by equating these two proportions (estimating a population proportion by a
sample proportion). ■
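
The closed-form result can be compared with a brute-force search over integer values of $N$. The Python sketch below evaluates the hypergeometric likelihood exactly (using `math.comb` and `fractions.Fraction`) for the fish illustration and picks the maximizing $N$. The function names and the search limit of 3000 are arbitrary choices made for this sketch.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N) as an exact fraction."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

def capture_recapture_mle(x, n, M, n_max):
    """Integer N in [M + n - x, n_max] that maximizes the likelihood p(x; N)."""
    smallest = M + (n - x)  # at least n - x untagged animals must exist
    return max(range(smallest, n_max + 1),
               key=lambda N: hypergeom_pmf(x, n, M, N))

M, n, x = 200, 100, 11  # values from the fish illustration
n_hat_search = capture_recapture_mle(x, n, M, n_max=3000)
n_hat_formula = (M * n) // x  # [Mn/x], the largest integer <= Mn/x
print(n_hat_search, n_hat_formula)
```

Exact fractions are used so that the comparison between neighboring values of $N$, whose likelihoods differ only slightly near the maximum, is not affected by rounding.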


Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a pdf $f(x; \theta)$ that is symmetric
about $\theta$ but that the investigator is unsure of the form of the $f$ function. It is then
desirable to use an estimator $\hat{\theta}$ that is robust, that is, one that performs well for a
wide variety of underlying pdf's. One such estimator is a trimmed mean. In recent
years, statisticians have proposed another type of estimator, called an M-estimator,
based on a generalization of maximum likelihood estimation. Instead of maximizing
the log likelihood $\sum \ln[f(x_i; \theta)]$ for a specified $f$, one maximizes $\sum \rho(x_i; \theta)$. The
"objective function" $\rho$ is selected to yield an estimator with good robustness
properties. The book by David Hoaglin et al. (see the bibliography) contains a good
exposition of this topic.

EXERCISES Section 6.2 (20-30)


20. A diagnostic test for a certain disease is applied to $n$ individuals known to not have
the disease. Let $X$ = the number among the $n$ test results that are positive (indicating
presence of the disease, so $X$ is the number of false positives) and $p$ = the probability
that a disease-free individual's test result is positive (i.e., $p$ is the true proportion of
test results from disease-free individuals that are positive). Assume that only $X$ is
available rather than the actual sequence of test results.
a. Derive the maximum likelihood estimator of $p$. If $n = 20$ and $x = 3$, what is the
estimate?
b. Is the estimator of part (a) unbiased?
c. If $n = 20$ and $x = 3$, what is the mle of the probability $(1 - p)^5$ that none of the
next five tests done on disease-free individuals are positive?


21. Let $X$ have a Weibull distribution with parameters $\alpha$ and $\beta$, so

$$E(X) = \beta \cdot \Gamma(1 + 1/\alpha)$$
$$V(X) = \beta^2 \{ \Gamma(1 + 2/\alpha) - [\Gamma(1 + 1/\alpha)]^2 \}$$

a. Based on a random sample $X_1, \ldots, X_n$, write equations for the method of moments
estimators of $\beta$ and $\alpha$. Show that, once the estimate of $\alpha$ has been obtained,
the estimate of $\beta$ can be found from a table of the gamma function and that the
estimate of $\alpha$ is the solution to a complicated equation involving the gamma function.
b. If $n = 20$, $\bar{x} = 28.0$, and $\sum x_i^2 = 16{,}500$, compute the estimates.
[Hint: $[\Gamma(1.2)]^2/\Gamma(1.4) = .95$.]


22. Let $X$ denote the proportion of allotted time that a randomly selected student spends
working on a certain aptitude test. Suppose the pdf of $X$ is

$$f(x; \theta) = \begin{cases} (\theta + 1)x^{\theta} & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

where $-1 < \theta$. A random sample of ten students yields data $x_1 = .92$, $x_2 = .79$,
$x_3 = .90$, $x_4 = .65$, $x_5 = .86$, $x_6 = .47$, $x_7 = .73$, $x_8 = .97$, $x_9 = .94$, $x_{10} = .77$.
a. Use the method of moments to obtain an estimator of $\theta$, and then compute the
estimate for this data.
b. Obtain the maximum likelihood estimator of $\theta$, and then compute the estimate for
the given data.


23. Let $X$ represent the error in making a measurement of a physical characteristic or
property (e.g., the boiling point of a particular liquid). It is often reasonable to assume
that $E(X) = 0$ and that $X$ has a normal distribution. Thus the pdf of any particular
measurement error is

$$f(x; \theta) = \frac{1}{\sqrt{2\pi\theta}} e^{-x^2/2\theta} \qquad -\infty < x < \infty$$

(where we have used $\theta$ in place of $\sigma^2$). Now suppose that $n$ independent measurements
are made, resulting in measurement errors $X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n$. Obtain
the mle of $\theta$.


24. A vehicle with a particular defect in its emission control system is taken to a
succession of randomly selected mechanics until $r = 3$ of them have correctly diagnosed
the problem. Suppose that this requires diagnoses by 20 different mechanics (so there
were 17 incorrect diagnoses). Let $p$ = P(correct diagnosis), so $p$ is the proportion
of all mechanics who would correctly diagnose the problem. What is the mle of $p$? Is it
the same as the mle if a random sample of 20 mechanics results in 3 correct diagnoses?
Explain. How does the mle compare to the estimate resulting from the use of the
unbiased estimator given in Exercise 17?


25. The shear strength of each of ten test spot welds is determined, yielding the
following data (psi):

392 376 401 367 389 362 409 415 358 375

a. Assuming that shear strength is normally distributed, estimate the true average
shear strength and standard deviation of shear strength using the method of
maximum likelihood.
b. Again assuming a normal distribution, estimate the strength value below which 95%
of all welds will have their strengths. [Hint: What is the 95th percentile in
terms of $\mu$ and $\sigma$? Now use the invariance principle.]
c. Suppose we decide to examine another test spot weld. Let $X$ = shear strength of the
weld. Use the given data to obtain the mle of $P(X \le 400)$.
[Hint: $P(X \le 400) = \Phi((400 - \mu)/\sigma)$.]


26. Consider randomly selecting $n$ segments of pipe and determining the corrosion loss (mm)
in the wall thickness for each one. Denote these corrosion losses by $Y_1, \ldots, Y_n$.
The article "A Probabilistic Model for a Gas Explosion Due to Leakages in the Grey Cast
Iron Gas Mains" (Reliability Engr. and System Safety, 2013: 270-279) proposes a linear
corrosion model: $Y_i = t_i R$, where $t_i$ is the age of the pipe and $R$, the corrosion rate,
is exponentially distributed with parameter $\lambda$. Obtain the maximum likelihood estimator
of the exponential parameter (the resulting mle appears in the cited article).
[Hint: If $c > 0$ and $X$ has an exponential distribution, so does $cX$.]


27. Let $X_1, \ldots, X_n$ be a random sample from a gamma distribution with parameters
$\alpha$ and $\beta$.
a. Derive the equations whose solutions yield the maximum likelihood estimators of
$\alpha$ and $\beta$. Do you think they can be solved explicitly?
b. Show that the mle of $\mu = \alpha\beta$ is $\hat{\mu} = \bar{X}$.


28. Let $X_1, X_2, \ldots, X_n$ represent a random sample from the Rayleigh distribution with
density function given in Exercise 15. Determine
a. The maximum likelihood estimator of $\theta$, and then calculate the estimate for the
vibratory stress data given in that exercise. Is this estimator the same as
the unbiased estimator suggested in Exercise 15?
b. The mle of the median of the vibratory stress distribution. [Hint: First express the
median in terms of $\theta$.]


29. Consider a random sample $X_1, X_2, \ldots, X_n$ from the shifted exponential pdf

$$f(x; \lambda, \theta) = \begin{cases} \lambda e^{-\lambda(x - \theta)} & x \ge \theta \\ 0 & \text{otherwise} \end{cases}$$

Taking $\theta = 0$ gives the pdf of the exponential distribution considered previously (with
positive density to the right of zero). An example of the shifted exponential
distribution appeared in Example 4.5, in which the variable of interest was time headway
in traffic flow and $\theta = .5$ was the minimum possible time headway.
a. Obtain the maximum likelihood estimators of $\theta$ and $\lambda$.
b. If $n = 10$ time headway observations are made, resulting in the values 3.11, .64,
2.55, 2.20, 5.44, 3.42, 10.39, 8.93, 17.82, and 1.30, calculate the estimates of
$\theta$ and $\lambda$.


30. At time $t = 0$, 20 identical components are tested. The lifetime distribution of each is
exponential with parameter $\lambda$. The experimenter then leaves the test facility
unmonitored. On his return 24 hours later, the experimenter immediately terminates the
test after noticing that $y = 15$ of the 20 components are still in operation (so 5
have failed). Derive the mle of $\lambda$. [Hint: Let $Y$ = the number that survive 24 hours.
Then $Y \sim \text{Bin}(n, p)$. What is the mle of $p$? Now notice that $p = P(X_i \ge 24)$, where
$X_i$ is exponentially distributed. This relates $\lambda$ to $p$, so the former can be estimated
once the latter has been.]


SUPPLEMENTARY EXERCISES (31-38) 


31. An estimator $\hat{\theta}$ is said to be consistent if for any $\epsilon > 0$,
$P(|\hat{\theta} - \theta| \ge \epsilon) \to 0$ as $n \to \infty$. That is, $\hat{\theta}$ is consistent if,
as the sample size gets larger, it is less and less likely that $\hat{\theta}$ will be further
than $\epsilon$ from the true value of $\theta$. Show that $\bar{X}$ is a consistent estimator of $\mu$ when
$\sigma^2 < \infty$ by using Chebyshev's inequality from Exercise 44 of Chapter 3. [Hint: The
inequality can be rewritten in the form

$$P(|Y - \mu_Y| \ge \epsilon) \le \sigma_Y^2/\epsilon^2$$

Now identify $Y$ with $\bar{X}$.]


32. a. Let $X_1, \ldots, X_n$ be a random sample from a uniform distribution on $[0, \theta]$. Then
the mle of $\theta$ is $\hat{\theta} = Y = \max(X_i)$. Use the fact that $Y \le y$ iff each
$X_i \le y$ to derive the cdf of $Y$. Then show that the pdf of $Y = \max(X_i)$ is

$$f_Y(y) = \begin{cases} \dfrac{n y^{n-1}}{\theta^n} & 0 \le y \le \theta \\ 0 & \text{otherwise} \end{cases}$$

b. Use the result of part (a) to show that the mle is biased but that
$(n + 1)\max(X_i)/n$ is unbiased.


33. At time $t = 0$, there is one individual alive in a certain population. A pure birth
process then unfolds as follows. The time until the first birth is exponentially
distributed with parameter $\lambda$. After the first birth, there are two individuals alive.
The time until the first gives birth again is exponential with parameter $\lambda$, and
similarly for the second individual. Therefore, the time until the next birth
is the minimum of two exponential ($\lambda$) variables, which is exponential with parameter
$2\lambda$. Similarly, once the second birth has occurred, there are three individuals alive,
so the time until the next birth is an exponential rv with parameter $3\lambda$, and so on
(the memoryless property of the exponential distribution is being used here). Suppose the
process is observed until the sixth birth has occurred and the successive birth times are
25.2, 41.7, 51.2, 55.5, 59.5, 61.8 (from which you should calculate the times between
successive births). Derive the mle of $\lambda$. [Hint: The likelihood is a product of
exponential terms.]


34. The mean squared error of an estimator $\hat{\theta}$ is $\text{MSE}(\hat{\theta}) = E(\hat{\theta} - \theta)^2$.
If $\hat{\theta}$ is unbiased, then $\text{MSE}(\hat{\theta}) = V(\hat{\theta})$, but in general
$\text{MSE}(\hat{\theta}) = V(\hat{\theta}) + (\text{bias})^2$. Consider the estimator $\hat{\sigma}^2 = KS^2$, where
$S^2$ = sample variance. What value of $K$ minimizes the mean squared error of this
estimator when the population distribution is normal? [Hint: It can be shown that

$$E[(S^2)^2] = (n + 1)\sigma^4/(n - 1)$$

In general, it is difficult to find $\hat{\theta}$ to minimize $\text{MSE}(\hat{\theta})$, which is why we look
only at unbiased estimators and minimize $V(\hat{\theta})$.]


35. Let $X_1, \ldots, X_n$ be a random sample from a pdf that is symmetric about $\mu$. An
estimator for $\mu$ that has been found to perform well for a variety of underlying
distributions is the Hodges-Lehmann estimator. To define it, first compute for each
$i \le j$ and each $j = 1, 2, \ldots, n$ the pairwise average $\bar{X}_{i,j} = (X_i + X_j)/2$. Then the
estimator is $\hat{\mu}$ = the median of the $\bar{X}_{i,j}$'s. Compute the value of this estimate
using the data of Exercise 44 of Chapter 1. [Hint: Construct a square table with the
$x_i$'s listed on the left margin and on top. Then compute averages on and above the
diagonal.]


36. When the population distribution is normal, the statistic
median$\{|X_1 - \tilde{X}|, \ldots, |X_n - \tilde{X}|\}/.6745$ can be used to
estimate $\sigma$. This estimator is more resistant to the effects
of outliers (observations far from the bulk of the data)
than is the sample standard deviation. Compute both the
corresponding point estimate and $s$ for the data of
Example 6.2.


37. When the sample standard deviation $S$ is based on a random sample from a normal
population distribution, it can be shown that

$$E(S) = \sqrt{2/(n - 1)}\,\Gamma(n/2)\,\sigma/\Gamma((n - 1)/2)$$


38. Each of $n$ specimens is to be weighed twice on the same
scale. Let $X_i$ and $Y_i$ denote the two observed weights for
the $i$th specimen. Suppose $X_i$ and $Y_i$ are independent of
one another, each normally distributed with mean value
$\mu_i$ (the true weight of specimen $i$) and variance $\sigma^2$.
a. Show that the maximum likelihood estimator of $\sigma^2$ is
$\hat{\sigma}^2 = \sum (X_i - Y_i)^2/(4n)$. [Hint: If $\bar{z} = (z_1 + z_2)/2$,
then $\sum (z_i - \bar{z})^2 = (z_1 - z_2)^2/2$.]
b. Is the mle $\hat{\sigma}^2$ an unbiased estimator of $\sigma^2$? Find an
unbiased estimator of $\sigma^2$. [Hint: For any rv $Z$,
$E(Z^2) = V(Z) + [E(Z)]^2$. Apply this to $Z = X_i - Y_i$.]


Statistical Intervals Based on a Single Sample

INTRODUCTION 


A point estimate, because it is a single number, by itself provides no information
about the precision and reliability of estimation. Consider, for example,
using the statistic $\bar{X}$ to calculate a point estimate for the true average breaking
strength (g) of paper towels of a certain brand, and suppose that $\bar{x} = 9322.7$.
Because of sampling variability, it is virtually never the case that $\bar{x} = \mu$. The
point estimate says nothing about how close it might be to $\mu$. An alternative
to reporting a single sensible value for the parameter being estimated is
to calculate and report an entire interval of plausible values: an interval estimate
or confidence interval (CI). A confidence interval is always calculated
by first selecting a confidence level, which is a measure of the degree of
reliability of the interval. A confidence interval with a 95% confidence level
for the true average breaking strength might have a lower limit of 9162.5
and an upper limit of 9482.9. Then at the 95% confidence level, any value
of $\mu$ between 9162.5 and 9482.9 is plausible. A confidence level of 95%
implies that 95% of all samples would give an interval that includes $\mu$, or
whatever other parameter is being estimated, and only 5% of all samples
would yield an erroneous interval. The most frequently used confidence
levels are 95%, 99%, and 90%. The higher the confidence level, the more
strongly we believe that the value of the parameter being estimated lies
within the interval (an interpretation of any particular confidence level will
be given shortly).

Information about the precision of an interval estimate is conveyed by
the width of the interval. If the confidence level is high and the resulting
interval is quite narrow, our knowledge of the value of the parameter is
reasonably precise. A very wide confidence interval, however, gives the message
that there is a great deal of uncertainty concerning the value of what we are
estimating. Figure 7.1 shows 95% confidence intervals for true average breaking
strengths of two different brands of paper towels. One of these intervals
suggests precise knowledge about $\mu$, whereas the other suggests a very wide
range of plausible values.


Figure 7.1 CIs indicating precise (brand 1) and imprecise (brand 2) information about $\mu$ (one interval per brand, each drawn on a strength axis)


7.1 Basic Properties of Confidence Intervals 


The basic concepts and properties of confidence intervals (CIs) are most easily intro- 
duced by first focusing on a simple, albeit somewhat unrealistic, problem situation. 
Suppose that the parameter of interest is a population mean $\mu$ and that


1. The population distribution is normal


2. The value of the population standard deviation $\sigma$ is known


Normality of the population distribution is often a reasonable assumption. However,
if the value of $\mu$ is unknown, it is typically implausible that the value of $\sigma$ would be
available (knowledge of a population's center typically precedes information
concerning spread). We'll develop methods based on less restrictive assumptions in
Sections 7.2 and 7.3.


EXAMPLE 7.1 Industrial engineers who specialize in ergonomics are concerned with designing
workspace and worker-operated devices so as to achieve high productivity and comfort.
The article "Studies on Ergonomically Designed Alphanumeric Keyboards"
(Human Factors, 1985: 175-187) reports on a study of preferred height for an
experimental keyboard with large forearm-wrist support. A sample of $n = 31$ trained
typists was selected, and the preferred keyboard height was determined for each typist.
The resulting sample average preferred height was $\bar{x} = 80.0$ cm. Assuming that the
preferred height is normally distributed with $\sigma = 2.0$ cm (a value suggested by data
in the article), obtain a confidence interval (interval of plausible values) for $\mu$, the true
average preferred height for the population of all experienced typists. ■


The actual sample observations $x_1, x_2, \ldots, x_n$ are assumed to be the result of a
random sample $X_1, \ldots, X_n$ from a normal distribution with mean value $\mu$ and standard
deviation $\sigma$. The results described in Chapter 5 then imply that, irrespective
of the sample size $n$, the sample mean $\bar{X}$ is normally distributed with expected
value $\mu$ and standard deviation $\sigma/\sqrt{n}$. Standardizing $\bar{X}$ by first subtracting its
expected value and then dividing by its standard deviation yields the standard
normal variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad (7.1)$$

Because the area under the standard normal curve between $-1.96$ and $1.96$ is .95,

$$P\left( -1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96 \right) = .95 \qquad (7.2)$$

Now let's manipulate the inequalities inside the parentheses in (7.2) so that
they appear in the equivalent form $l < \mu < u$, where the endpoints $l$ and $u$ involve
$\bar{X}$ and $\sigma/\sqrt{n}$. This is achieved through the following sequence of operations, each
yielding inequalities equivalent to the original ones.


1. Multiply through by $\sigma/\sqrt{n}$:

$$-1.96 \cdot \frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

2. Subtract $\bar{X}$ from each term:

$$-\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < -\mu < -\bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

3. Multiply through by $-1$ to eliminate the minus sign in front of $\mu$ (which
reverses the direction of each inequality):

$$\bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} > \mu > \bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

that is,

$$\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

The equivalence of each set of inequalities to the original set implies that

$$P\left( \bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}} \right) = .95 \qquad (7.3)$$

The event inside the parentheses in (7.3) has a somewhat unfamiliar appearance;
previously, the random quantity has appeared in the middle with constants on
both ends, as in $a \le Y \le b$. In (7.3) the random quantity appears on the two ends,
whereas the unknown constant appears in the middle. To interpret (7.3), think
of a random interval having left endpoint $\bar{X} - 1.96 \cdot \sigma/\sqrt{n}$ and right endpoint
$\bar{X} + 1.96 \cdot \sigma/\sqrt{n}$. In interval notation, this becomes

$$\left( \bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \right) \qquad (7.4)$$

The interval (7.4) is random because the two endpoints of the interval involve a
random variable. It is centered at the sample mean $\bar{X}$ and extends $1.96\sigma/\sqrt{n}$ to
each side of $\bar{X}$. Thus the interval's width is $2 \cdot (1.96) \cdot \sigma/\sqrt{n}$, a fixed number; only
the location of the interval (its midpoint $\bar{X}$) is random (Figure 7.2). Now (7.3) can
be paraphrased as "the probability is .95 that the random interval (7.4) includes or
covers the true value of $\mu$." Before any data is gathered, it is quite likely that $\mu$ will
lie inside the interval (7.4).


Figure 7.2 The random interval (7.4) centered at $\bar{X}$ (endpoints $\bar{X} - 1.96\sigma/\sqrt{n}$ and $\bar{X} + 1.96\sigma/\sqrt{n}$, each at distance $1.96\sigma/\sqrt{n}$ from $\bar{X}$)


DEFINITION If, after observing $X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n$, we compute the observed
sample mean $\bar{x}$ and then substitute $\bar{x}$ into (7.4) in place of $\bar{X}$, the resulting fixed
interval is called a 95% confidence interval for $\mu$. This CI can be expressed
either as

$$\left( \bar{x} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \right) \text{ is a 95\% CI for } \mu$$

or as

$$\bar{x} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \quad \text{with 95\% confidence}$$

A concise expression for the interval is $\bar{x} \pm 1.96 \cdot \sigma/\sqrt{n}$, where $-$ gives the
left endpoint (lower limit) and $+$ gives the right endpoint (upper limit).


EXAMPLE 7.2 (Example 7.1 continued) The quantities needed for computation of the 95% CI for
true average preferred height are $\sigma = 2.0$, $n = 31$, and $\bar{x} = 80.0$. The resulting interval is

$$\bar{x} \pm 1.96 \cdot \frac{\sigma}{\sqrt{n}} = 80.0 \pm (1.96)\frac{2.0}{\sqrt{31}} = 80.0 \pm .7 = (79.3, 80.7)$$

That is, we can be highly confident, at the 95% confidence level, that
$79.3 < \mu < 80.7$. This interval is relatively narrow, indicating that $\mu$ has been
rather precisely estimated. ■


Interpreting a Confidence Level 


The confidence level 95% for the interval just defined was inherited from the prob- 
ability .95 for the random interval (7.4). Intervals having other levels of confidence 
will be introduced shortly. For now, though, consider how 95% confidence can be 
interpreted. 

We started with an event whose probability was .95 (that the random interval
(7.4) would capture the true value of $\mu$) and then used the data in Example 7.1 to
compute the CI (79.3, 80.7). It is therefore tempting to conclude that $\mu$ is within
this fixed interval with probability .95. But by substituting $\bar{x} = 80.0$ for $\bar{X}$, all
randomness disappears; the interval (79.3, 80.7) is not a random interval, and $\mu$ is a
constant (unfortunately unknown to us). Thus it is incorrect to write the statement
$P(\mu \text{ lies in } (79.3, 80.7)) = .95$.

A correct interpretation of "95% confidence" relies on the long-run relative
frequency interpretation of probability: To say that an event $A$ has probability .95 is
to say that if the experiment on which $A$ is defined is performed over and over again,
in the long run $A$ will occur 95% of the time. Suppose we obtain another sample of
typists' preferred heights and compute another 95% interval. Now consider repeating
this for a third sample, a fourth sample, a fifth sample, and so on. Let $A$ be the
event that $\bar{X} - 1.96 \cdot \sigma/\sqrt{n} < \mu < \bar{X} + 1.96 \cdot \sigma/\sqrt{n}$. Since $P(A) = .95$, in the
long run 95% of our computed CIs will contain $\mu$. This is illustrated in Figure 7.3,
where the vertical line cuts the measurement axis at the true (but unknown) value of
$\mu$. Notice that 7 of the 100 intervals shown fail to contain $\mu$. In the long run, only
5% of the intervals so constructed would fail to contain $\mu$.
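
The long-run interpretation can be illustrated by simulation. The Python sketch below repeatedly draws samples from a normal population, builds the interval $\bar{x} \pm z \cdot \sigma/\sqrt{n}$ each time, and records how often it covers $\mu$. The "true" mean of 80.0, the number of replications, and the random seed are assumptions made for this illustration (in practice $\mu$ is unknown).

```python
import random
from statistics import NormalDist

def coverage_rate(mu, sigma, n, reps, conf=0.95, seed=1):
    """Fraction of simulated z intervals (known sigma) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for conf = 0.95
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

# Hypothetical true mean; sigma and n match the keyboard-height example
rate = coverage_rate(mu=80.0, sigma=2.0, n=31, reps=5000)
print(rate)  # should be close to 0.95
```

Any single simulated interval either contains $\mu$ or does not; the 95% figure describes the proportion of intervals that do so over many replications.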

According to this interpretation, the confidence level 95% is not so much a
statement about any particular interval such as (79.3, 80.7). Instead it pertains to
what would happen if a very large number of like intervals were to be constructed
using the same CI formula. Although this may seem unsatisfactory, the root of the
difficulty lies with our interpretation of probability: it applies to a long sequence of
replications of an experiment rather than just a single replication. There is another
approach to the construction and interpretation of CIs that uses the notion of
subjective probability and Bayes' theorem, but the technical details are beyond the scope
of this text; the book by DeGroot, et al. (see the Chapter 6 bibliography) is a good
source. The interval presented here (as well as each interval presented subsequently)
is called a "classical" CI because its interpretation rests on the classical notion of
probability.

Figure 7.3 One hundred 95% CIs (asterisks identify intervals that do not include $\mu$)


Other Levels of Confidence 


The confidence level of 95% was inherited from the probability .95 for the initial 
inequalities in (7.2). If a confidence level of 99% is desired, the initial probability 
of .95 must be replaced by .99, which necessitates changing the z critical value from 
1.96 to 2.58. A 99% CI then results from using 2.58 in place of 1.96 in the formula 
for the 95% CI. 

In fact, any desired level of confidence can be achieved by replacing 1.96 or
2.58 with the appropriate standard normal critical value. Recall from Chapter 4 the
notation for a $z$ critical value: $z_{\alpha}$ is the number on the horizontal $z$ scale that captures
upper tail area $\alpha$. As Figure 7.4 shows, a probability (i.e., central $z$ curve area) of
$1 - \alpha$ is achieved by using $z_{\alpha/2}$ in place of 1.96.

Figure 7.4 $P(-z_{\alpha/2} \le Z < z_{\alpha/2}) = 1 - \alpha$ ($z$ curve with shaded area $\alpha/2$ in each tail, beyond $-z_{\alpha/2}$ and $z_{\alpha/2}$)


DEFINITION A $100(1 - \alpha)\%$ confidence interval for the mean $\mu$ of a normal population
when the value of $\sigma$ is known is given by

$$\left( \bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \right) \qquad (7.5)$$

or, equivalently, by $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$.


The formula (7.5) for the CI can also be expressed in words as


point estimate of $\mu$ $\pm$ ($z$ critical value)(standard error of the mean).


EXAMPLE 7.3 The production process for engine control housing units of a particular type has 
recently been modified. Prior to this modification, historical data had suggested that the 
distribution of hole diameters for bushings on the housings was normal with a standard 
deviation of .100 mm. It is believed that the modification has not affected the shape of 
the distribution or the standard deviation, but that the value of the mean diameter may 
have changed. A sample of 40 housing units is selected and hole diameter is deter- 
mined for each one, resulting in a sample mean diameter of 5.426 mm. Let’s calculate 
a confidence interval for true average hole diameter using a confidence level of 90%. 
This requires that $100(1 - \alpha) = 90$, from which $\alpha = .10$ and $z_{\alpha/2} = z_{.05} = 1.645$
(corresponding to a cumulative $z$-curve area of .9500). The desired interval is then

$$5.426 \pm (1.645)\frac{.100}{\sqrt{40}} = 5.426 \pm .026 = (5.400, 5.452)$$

With a reasonably high degree of confidence, we can say that $5.400 < \mu < 5.452$.
This interval is rather narrow because of the small amount of variability in hole
diameter ($\sigma = .100$). ■
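
Formula (7.5) translates directly into a few lines of code. The Python sketch below obtains the critical value $z_{\alpha/2}$ from the standard library's `statistics.NormalDist` rather than from a table; the function name is hypothetical, and the inputs are the values from Examples 7.2 and 7.3.

```python
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, conf=0.95):
    """100(1 - alpha)% CI for a normal mean when sigma is known: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper-tail critical value z_{alpha/2}
    half_width = z * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width

# Example 7.2: preferred keyboard height, 95% confidence
lo_95, hi_95 = z_confidence_interval(80.0, 2.0, 31, conf=0.95)
print(round(lo_95, 1), round(hi_95, 1))  # 79.3 80.7

# Example 7.3: hole diameter, 90% confidence
lo_90, hi_90 = z_confidence_interval(5.426, 0.100, 40, conf=0.90)
print(round(lo_90, 3), round(hi_90, 3))  # 5.4 5.452
```

Because `inv_cdf` returns unrounded critical values (for instance 1.95996 rather than 1.96), endpoints may differ from hand calculations in the later decimal places.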


Confidence Level, Precision, and Sample Size 


Why settle for a confidence level of 95% when a level of 99% is achievable? 
Because the price paid for the higher confidence level is a wider interval. Since 
the 95% interval extends $1.96 \cdot \sigma/\sqrt{n}$ to each side of $\bar{x}$, the width of the interval
is $2(1.96) \cdot \sigma/\sqrt{n} = 3.92 \cdot \sigma/\sqrt{n}$. Similarly, the width of the 99% interval is
$2(2.58) \cdot \sigma/\sqrt{n} = 5.16 \cdot \sigma/\sqrt{n}$. That is, we have more confidence in the 99%
interval precisely because it is wider. The higher the desired degree of confidence,
the wider the resulting interval will be.

If we think of the width of the interval as specifying its precision or accuracy,
then the confidence level (or reliability) of the interval is inversely related to its
precision. A highly reliable interval estimate may be imprecise in that the endpoints
of the interval may be far apart, whereas a precise interval may entail relatively low
reliability. Thus it cannot be said unequivocally that a 99% interval is to be preferred
to a 95% interval; the gain in reliability entails a loss in precision.

An appealing strategy is to specify both the desired confidence level and inter- 
val width and then determine the necessary sample size. 


EXAMPLE 7.4 Extensive monitoring of a computer time-sharing system has suggested that response 
time to a particular editing command is normally distributed with standard deviation 
25 millisec. A new operating system has been installed, and we wish to estimate the 
true average response time $\mu$ for the new environment. Assuming that response times
are still normally distributed with $\sigma = 25$, what sample size is necessary to ensure
that the resulting 95% CI has a width of (at most) 10? The sample size $n$ must satisfy

$$10 = 2 \cdot (1.96)(25/\sqrt{n})$$

Rearranging this equation gives

$$\sqrt{n} = 2 \cdot (1.96)(25)/10 = 9.80$$

so

$$n = (9.80)^2 = 96.04$$

Since $n$ must be an integer, a sample size of 97 is required. ■


A general formula for the sample size $n$ necessary to ensure an interval width
$w$ is obtained from equating $w$ to $2 \cdot z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and solving for $n$.


The sample size necessary for the CI (7.5) to have a width $w$ is

$$n = \left( 2 z_{\alpha/2} \cdot \frac{\sigma}{w} \right)^2$$


The smaller the desired width $w$, the larger $n$ must be. In addition, $n$ is an increasing
function of $\sigma$ (more population variability necessitates a larger sample size) and of
the confidence level $100(1 - \alpha)\%$ (as $\alpha$ decreases, $z_{\alpha/2}$ increases).

The half-width $1.96\sigma/\sqrt{n}$ of the 95% CI is sometimes called the bound on
the error of estimation associated with a 95% confidence level. That is, with 95%
confidence, the point estimate $\bar{x}$ will be no farther than this from $\mu$. Before obtaining
data, an investigator may wish to determine a sample size for which a particular
value of the bound is achieved. For example, with $\mu$ representing the average fuel
efficiency (mpg) for all cars of a certain type, the objective of an investigation may
be to estimate $\mu$ to within 1 mpg with 95% confidence. More generally, if we wish
to estimate $\mu$ to within an amount B (the specified bound on the error of estimation)
with $100(1 - \alpha)\%$ confidence, the necessary sample size results from replacing 2/w
by 1/B in the formula in the preceding box.
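The sample-size formula for a known-$\sigma$ interval is easy to check numerically. The following is a minimal Python sketch, not part of the original text: the function names are hypothetical, and the z critical value is taken from the standard library's `NormalDist` (about 1.95996) rather than the rounded table value 1.96.

```python
import math
from statistics import NormalDist


def z_critical(conf_level):
    """Two-sided critical value z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1.0 - conf_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def sample_size_for_width(sigma, width, conf_level=0.95):
    """Smallest integer n for which the known-sigma z interval has width <= `width`."""
    z = z_critical(conf_level)
    return math.ceil((2.0 * z * sigma / width) ** 2)


def sample_size_for_bound(sigma, bound, conf_level=0.95):
    """Same calculation stated through the bound B on the error of estimation.

    Replacing 2/w by 1/B is the same as asking for a width of 2B.
    """
    return sample_size_for_width(sigma, 2.0 * bound, conf_level)


if __name__ == "__main__":
    # Setting of Example 7.4: sigma = 25, desired width 10, 95% confidence
    print(sample_size_for_width(25, 10))   # 97
    print(sample_size_for_bound(25, 5))    # 97 (a bound of 5 is a width of 10)
```

Rounding up with `math.ceil` reflects the requirement that n be an integer at least as large as the computed value.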


Deriving a Confidence Interval 


Let $X_1, X_2, \ldots, X_n$ denote the sample on which the CI for a parameter $\theta$ is to be based.
Suppose a random variable satisfying the following two properties can be found:


1. The variable is a function of both $X_1, \ldots, X_n$ and $\theta$.


2. The probability distribution of the variable does not depend on $\theta$ or on any
other unknown parameters.

Let $h(X_1, X_2, \ldots, X_n; \theta)$ denote this random variable. For example, if
the population distribution is normal with known $\sigma$ and $\theta = \mu$, the variable
$h(X_1, \ldots, X_n; \mu) = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ satisfies both properties; it clearly depends
functionally on $\mu$, yet has the standard normal probability distribution irrespective of
the value of $\mu$. In general, the form of the h function is usually suggested by examining
the distribution of an appropriate estimator $\hat{\theta}$.

For any $\alpha$ between 0 and 1, constants a and b can be found to satisfy

$$P(a < h(X_1, \ldots, X_n; \theta) < b) = 1 - \alpha \qquad (7.6)$$

Because of the second property, a and b do not depend on $\theta$. In the normal example,
$a = -z_{\alpha/2}$ and $b = z_{\alpha/2}$. Now suppose that the inequalities in (7.6) can be manipulated
to isolate $\theta$, giving the equivalent probability statement

$$P(l(X_1, X_2, \ldots, X_n) < \theta < u(X_1, X_2, \ldots, X_n)) = 1 - \alpha$$

Then $l(x_1, x_2, \ldots, x_n)$ and $u(x_1, \ldots, x_n)$ are the lower and upper confidence
limits, respectively, for a $100(1 - \alpha)\%$ CI. In the normal example, we saw that
$l(X_1, \ldots, X_n) = \bar{X} - z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and $u(X_1, \ldots, X_n) = \bar{X} + z_{\alpha/2} \cdot \sigma/\sqrt{n}$.


EXAMPLE 7.5 A theoretical model suggests that the time to breakdown of an insulating fluid
between electrodes at a particular voltage has an exponential distribution with
parameter $\lambda$ (see Section 4.4). A random sample of n = 10 breakdown times yields
the following sample data (in min): $x_1 = 41.53$, $x_2 = 18.73$, $x_3 = 2.99$, $x_4 = 30.34$,
$x_5 = 12.33$, $x_6 = 117.52$, $x_7 = 73.02$, $x_8 = 223.63$, $x_9 = 4.00$, $x_{10} = 26.78$. A 95%
CI for $\lambda$ and for the true average breakdown time are desired.

Let $h(X_1, X_2, \ldots, X_n; \lambda) = 2\lambda\sum X_i$. It can be shown that this random variable
has a probability distribution called a chi-squared distribution with 2n degrees of
freedom (df) ($\nu = 2n$, where $\nu$ is the parameter of a chi-squared distribution as
mentioned in Section 4.4). Appendix Table A.7 pictures a typical chi-squared density
curve and tabulates critical values that capture specified tail areas. The relevant
number of df here is 2(10) = 20. The $\nu = 20$ row of the table shows that 34.170
captures upper-tail area .025 and 9.591 captures lower-tail area .025 (upper-tail area
.975). Thus for n = 10,

$$P\left(9.591 < 2\lambda\sum X_i < 34.170\right) = .95$$

Division by $2\sum X_i$ isolates $\lambda$, yielding

$$P\left(9.591/(2\sum X_i) < \lambda < 34.170/(2\sum X_i)\right) = .95$$

The lower limit of the 95% CI for $\lambda$ is $9.591/(2\sum x_i)$, and the upper limit is
$34.170/(2\sum x_i)$. For the given data, $\sum x_i = 550.87$, giving the interval (.00871,
.03101).

The expected value of an exponential rv is $\mu = 1/\lambda$. Since

$$P\left(2\sum X_i/34.170 < 1/\lambda < 2\sum X_i/9.591\right) = .95$$

the 95% CI for true average breakdown time is $(2\sum x_i/34.170,\ 2\sum x_i/9.591)$ =
(32.24, 114.87). This interval is obviously quite wide, reflecting substantial variability
in breakdown times and a small sample size.
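The arithmetic of Example 7.5 can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and the chi-squared critical values 9.591 and 34.170 are passed in as the table values quoted in the example rather than computed.

```python
def exponential_rate_ci(data, chi2_lower, chi2_upper):
    """CI for the exponential parameter lambda from the pivot 2*lambda*sum(X_i).

    chi2_lower and chi2_upper are chi-squared critical values with 2n df that
    cut off the chosen lower- and upper-tail areas; the caller supplies them
    (for instance from a printed table).
    """
    total = sum(data)
    return chi2_lower / (2.0 * total), chi2_upper / (2.0 * total)


def exponential_mean_ci(data, chi2_lower, chi2_upper):
    """CI for the mean 1/lambda: invert the rate limits, which swaps their order."""
    lam_low, lam_high = exponential_rate_ci(data, chi2_lower, chi2_upper)
    return 1.0 / lam_high, 1.0 / lam_low


# Breakdown times (min) from Example 7.5; n = 10, so the pivot has 20 df
times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
rate_ci = exponential_rate_ci(times, 9.591, 34.170)
mean_ci = exponential_mean_ci(times, 9.591, 34.170)

if __name__ == "__main__":
    print([round(v, 5) for v in rate_ci])  # [0.00871, 0.03101]
    print([round(v, 2) for v in mean_ci])  # [32.24, 114.87]
```

Because $1/\lambda$ is a decreasing function of $\lambda$, the upper limit for $\lambda$ produces the lower limit for the mean, which is why the two limits trade places.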


In general, the upper and lower confidence limits result from replacing each
< in (7.6) by = and solving for $\theta$. In the insulating fluid example just considered,
$2\lambda\sum x_i = 34.170$ gives $\lambda = 34.170/(2\sum x_i)$ as the upper confidence limit, and the
lower limit is obtained from the other equation. Notice that the two interval limits are
not equidistant from the point estimate, since the interval is not of the form $\hat{\theta} \pm c$.


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.


Bootstrap Confidence Intervals 


The bootstrap technique was introduced in Chapter 6 as a way of estimating $\sigma_{\hat{\theta}}$. It
can also be applied to obtain a CI for $\theta$. Consider again estimating the mean $\mu$ of a
normal distribution when $\sigma$ is known. Let’s replace $\mu$ by $\theta$ and use $\hat{\theta} = \bar{X}$ as the point
estimator. Notice that $1.96\sigma/\sqrt{n}$ is the 97.5th percentile of the distribution of $\hat{\theta} - \theta$
[that is, $P(\bar{X} - \mu < 1.96\sigma/\sqrt{n}) = P(Z < 1.96) = .9750$]. Similarly, $-1.96\sigma/\sqrt{n}$ is
the 2.5th percentile, so

$$.95 = P(\text{2.5th percentile} < \hat{\theta} - \theta < \text{97.5th percentile})$$
$$= P(\hat{\theta} - \text{2.5th percentile} > \theta > \hat{\theta} - \text{97.5th percentile})$$

That is, with

$$l = \hat{\theta} - \text{97.5th percentile of } \hat{\theta} - \theta$$
$$u = \hat{\theta} - \text{2.5th percentile of } \hat{\theta} - \theta \qquad (7.7)$$

the CI for $\theta$ is (l, u). In many cases, the percentiles in (7.7) cannot be calculated, but
they can be estimated from bootstrap samples. Suppose we obtain B = 1000 bootstrap
samples and calculate $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_{1000}$ and $\bar{\theta}^*$, followed by the 1000 differences
$\hat{\theta}^*_1 - \bar{\theta}^*, \ldots, \hat{\theta}^*_{1000} - \bar{\theta}^*$. The 25th largest and 25th smallest of these differences are
estimates of the unknown percentiles in (7.7). Consult the Devore and Berk or Efron
books cited in Chapter 6 for more information.
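The recipe in (7.7) can be sketched in a few lines of Python. This is a hedged illustration rather than a production routine: the function name, the default seed, and the choice of the sample mean as estimator are assumptions made here, and resampling uses only the standard library.

```python
import random
from statistics import mean


def bootstrap_ci(sample, estimator=mean, n_boot=1000, conf_level=0.95, seed=0):
    """Bootstrap CI built from the differences theta*_i - (mean of the theta*'s)."""
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    n = len(sample)
    theta_hat = estimator(sample)

    # One estimate per bootstrap sample (n draws with replacement from the data)
    boot = [estimator([rng.choice(sample) for _ in range(n)])
            for _ in range(n_boot)]
    boot_mean = mean(boot)
    diffs = sorted(b - boot_mean for b in boot)

    # k = 25 when B = 1000 and the confidence level is 95%
    k = max(1, round(n_boot * (1.0 - conf_level) / 2.0))
    low_pct = diffs[k - 1]    # k-th smallest difference: estimated 2.5th percentile
    high_pct = diffs[-k]      # k-th largest difference: estimated 97.5th percentile

    # l = theta_hat - upper percentile, u = theta_hat - lower percentile
    return theta_hat - high_pct, theta_hat - low_pct


if __name__ == "__main__":
    # Illustrative data only: the breakdown times of Example 7.5
    times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
    print(bootstrap_ci(times))
```

Note that the upper percentile is subtracted to obtain the lower limit, exactly as in (7.7); for a skewed sample the resulting interval need not be symmetric about the point estimate.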


EXERCISES Section 7.1 (1-11) 


1. Consider a normal population distribution with the value of $\sigma$ known.

a. What is the confidence level for the interval $\bar{x} \pm 2.81\sigma/\sqrt{n}$?

b. What is the confidence level for the interval $\bar{x} \pm 1.44\sigma/\sqrt{n}$?

c. What value of $z_{\alpha/2}$ in the CI formula (7.5) results in a
confidence level of 99.7%?

d. Answer the question posed in part (c) for a confidence
level of 75%.


2. Each of the following is a confidence interval for $\mu$ =
true average (i.e., population mean) resonance frequency
(Hz) for all tennis rackets of a certain type:


(114.4, 115.6) (114.1, 115.9)


a. What is the value of the sample mean resonance
frequency?

b. Both intervals were calculated from the same sample
data. The confidence level for one of these intervals
is 90% and for the other is 99%. Which of the intervals
has the 90% confidence level, and why?


3. Suppose that a random sample of 50 bottles of a particular
brand of cough syrup is selected and the alcohol content of
each bottle is determined. Let $\mu$ denote the average alcohol
content for the population of all bottles of the brand
under study. Suppose that the resulting 95% confidence
interval is (7.8, 9.4).

a. Would a 90% confidence interval calculated from
this same sample have been narrower or wider than
the given interval? Explain your reasoning.

b. Consider the following statement: There is a 95%
chance that $\mu$ is between 7.8 and 9.4. Is this statement
correct? Why or why not?

c. Consider the following statement: We can be highly
confident that 95% of all bottles of this type of cough
syrup have an alcohol content that is between 7.8 and
9.4. Is this statement correct? Why or why not?

d. Consider the following statement: If the process of
selecting a sample of size 50 and then computing the
corresponding 95% interval is repeated 100 times, 95
of the resulting intervals will include $\mu$. Is this statement
correct? Why or why not?


4. A CI is desired for the true average stray-load loss $\mu$
(watts) for a certain type of induction motor when the
line current is held at 10 amps for a speed of 1500 rpm.
Assume that stray-load loss is normally distributed with
$\sigma = 3.0$.

a. Compute a 95% CI for $\mu$ when n = 25 and $\bar{x} = 58.3$.

b. Compute a 95% CI for $\mu$ when n = 100 and $\bar{x} = 58.3$.

c. Compute a 99% CI for $\mu$ when n = 100 and $\bar{x} = 58.3$.

d. Compute an 82% CI for $\mu$ when n = 100 and $\bar{x} = 58.3$.

e. How large must n be if the width of the 99% interval
for $\mu$ is to be 1.0?


5. Assume that the helium porosity (in percentage) of coal
samples taken from any particular seam is normally distributed
with true standard deviation .75.

a. Compute a 95% CI for the true average porosity of a
certain seam if the average porosity for 20 specimens
from the seam was 4.85.

b. Compute a 98% CI for true average porosity of
another seam based on 16 specimens with a sample
average porosity of 4.56.

c. How large a sample size is necessary if the width of
the 95% interval is to be .40?

d. What sample size is necessary to estimate true average
porosity to within .2 with 99% confidence?


6. On the basis of extensive tests, the yield point of a particular
type of mild steel-reinforcing bar is known to be normally
distributed with $\sigma = 100$. The composition of bars has
been slightly modified, but the modification is not believed
to have affected either the normality or the value of $\sigma$.

a. Assuming this to be the case, if a sample of 25
modified bars resulted in a sample average yield
point of 8439 lb, compute a 90% CI for the true average
yield point of the modified bar.

b. How would you modify the interval in part (a) to
obtain a confidence level of 92%?


7. By how much must the sample size n be increased if the
width of the CI (7.5) is to be halved? If the sample size
is increased by a factor of 25, what effect will this have
on the width of the interval? Justify your assertions.


8. Let $\alpha_1 > 0$, $\alpha_2 > 0$, with $\alpha_1 + \alpha_2 = \alpha$. Then

$$P\left(-z_{\alpha_1} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha_2}\right) = 1 - \alpha$$

a. Use this equation to derive a more general expression
for a $100(1 - \alpha)\%$ CI for $\mu$ of which the interval
(7.5) is a special case.

b. Let $\alpha = .05$ and $\alpha_1 = \alpha/4$, $\alpha_2 = 3\alpha/4$. Does this
result in a narrower or wider interval than the interval
(7.5)?


9. a. Under the same conditions as those leading to the
interval (7.5), $P[(\bar{X} - \mu)/(\sigma/\sqrt{n}) < 1.645] = .95$.
Use this to derive a one-sided interval for $\mu$ that has
infinite width and provides a lower confidence bound
on $\mu$. What is this interval for the data in Exercise 5(a)?

b. Generalize the result of part (a) to obtain a lower
bound with confidence level $100(1 - \alpha)\%$.

c. What is an analogous interval to that of part (b) that
provides an upper bound on $\mu$? Compute this 99%
interval for the data of Exercise 4(a).


10. A random sample of n = 15 heat pumps of a certain type
yielded the following observations on lifetime (in years):


20 13 60 19 5.1 4 10 5.3 
15.7 .7 48 9 122 5.3 6 


a. Assume that the lifetime distribution is exponential 
and use an argument parallel to that of Example 7.5 to 
obtain a 95% CI for expected (true average) lifetime. 

b. How should the interval of part (a) be altered to 
achieve a confidence level of 99%? 

c. What is a 95% CI for the standard deviation of the 
lifetime distribution? [Hint: What is the standard 
deviation of an exponential random variable?] 


11. Consider the next 1000 95% CIs for $\mu$ that a statistical
consultant will obtain for various clients. Suppose the
data sets on which the intervals are based are selected
independently of one another. How many of these 1000
intervals do you expect to capture the corresponding
value of $\mu$? What is the probability that between 940 and
960 of these intervals contain the corresponding value of
$\mu$? [Hint: Let Y = the number among the 1000 intervals
that contain $\mu$. What kind of random variable is Y?]


7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion


The CI for $\mu$ given in the previous section assumed that the population distribution
is normal with the value of $\sigma$ known. We now present a large-sample CI whose validity
does not require these assumptions. After showing how the argument leading to
this interval generalizes to yield other large-sample intervals, we focus on an interval
for a population proportion p.

A Large-Sample Interval for $\mu$


Let $X_1, X_2, \ldots, X_n$ be a random sample from a population having a mean $\mu$ and standard
deviation $\sigma$. Provided that n is sufficiently large, the Central Limit Theorem
(CLT) implies that $\bar{X}$ has approximately a normal distribution whatever the nature of
the population distribution. It then follows that $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ has approximately
a standard normal distribution, so that

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

An argument parallel to that given in Section 7.1 yields $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ as a large-sample
CI for $\mu$ with a confidence level of approximately $100(1 - \alpha)\%$. That is,
when n is large, the CI for $\mu$ given previously remains valid whatever the population
distribution, provided that the qualifier “approximately” is inserted in front of the
confidence level.

A practical difficulty with this development is that computation of the CI requires
the value of $\sigma$, which will rarely be known. Consider replacing the population standard
deviation $\sigma$ in Z by the sample standard deviation to obtain the standardized variable

$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$

Previously, there was randomness only in the numerator of Z by virtue of $\bar{X}$. In the
new standardized variable, both $\bar{X}$ and S vary in value from one sample to another.
So it might seem that the distribution of the new variable should be more spread out
than the z curve to reflect the extra variation in the denominator. This is indeed true
when n is small. However, for large n the substitution of S for $\sigma$ adds little extra
variability, so this variable also has approximately a standard normal distribution.
Manipulation of the variable in a probability statement, as in the case of known $\sigma$,
gives a general large-sample CI for $\mu$.


PROPOSITION If n is sufficiently large, the standardized variable

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has approximately a standard normal distribution. This implies that

$$\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \qquad (7.8)$$

is a large-sample confidence interval for $\mu$ with confidence level approximately
$100(1 - \alpha)\%$. This formula is valid regardless of the shape of the population
distribution.


In words, the CI (7.8) is
point estimate of $\mu$ ± (z critical value)(estimated standard error of the mean).


Generally speaking, n > 40 will be sufficient to justify the use of this interval.
This is somewhat more conservative than the rule of thumb for the CLT because of
the additional variability introduced by using S in place of $\sigma$.


EXAMPLE 7.6 Haven’t you always wanted to own a Porsche? The author thought maybe he could
afford a Boxster, the cheapest model. So he went to www.cars.com on Nov. 18,
2009, and found a total of 1113 such cars listed. Asking prices ranged from $3499
to $130,000 (the latter price was one of only two exceeding $70,000). The prices
depressed him, so he focused instead on odometer readings (miles). Here are
reported readings for a sample of 50 of these Boxsters:


2948 2996 7197 8338 8500 8759 12710 12925
15767 20000 23247 24863 26000 26210 30552 30600
35700 36466 40316 40596 41021 41234 43000 44607
45000 45027 45442 46963 47978 49518 52000 53334
54208 56062 57000 57365 60020 60265 60803 62851
64404 72140 74594 79308 79500 80000 80000 84000
113000 118634


A boxplot of the data (Figure 7.5) shows that, except for the two outliers at the upper 
end, the distribution of values is reasonably symmetric (in fact, a normal probability 
plot exhibits a reasonably linear pattern, though the points corresponding to the two 
smallest and two largest observations are somewhat removed from a line fit through 
the remaining points).


Figure 7.5 A boxplot of the odometer reading data from Example 7.6 [plot not reproduced; horizontal axis: Mileage, 0 to 120000]


Summary quantities include n = 50, $\bar{x}$ = 45,679.4, $\tilde{x}$ = 45,013.5, s = 26,641.675,
$f_s$ = 34,265. The mean and median are reasonably close (if the two largest values
were each reduced by 30,000, the mean would fall to 44,479.4, while the median
would be unaffected). The boxplot and the magnitudes of s and $f_s$ relative to the mean
and median both indicate a substantial amount of variability. A confidence level of
about 95% requires $z_{.025} = 1.96$, and the interval is

$$45,679.4 \pm (1.96)\left(\frac{26,641.675}{\sqrt{50}}\right) = 45,679.4 \pm 7384.7 = (38,294.7,\ 53,064.1)$$


That is, 38,294.7 < $\mu$ < 53,064.1 with 95% confidence. This interval is rather wide
because a sample size of 50, even though large by our rule of thumb, is not large
enough to overcome the substantial variability in the sample. We do not have a very
precise estimate of the population mean odometer reading.

Is the interval we’ve calculated one of the 95% that in the long run includes the
parameter being estimated, or is it one of the “bad” 5% that does not do so? Without
knowing the value of $\mu$, we cannot tell. Remember that the confidence level refers to
the long run capture percentage when the formula is used repeatedly on various samples;
it cannot be interpreted for a single sample and the resulting interval.
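The large-sample interval (7.8) is simple to compute from summary quantities. The Python sketch below is an illustration only: the function name is hypothetical, and because the critical value comes from the standard library's `NormalDist` (about 1.95996) rather than the rounded 1.96, the limits differ from the hand calculation in the first decimal place.

```python
import math
from statistics import NormalDist


def large_sample_mean_ci(xbar, s, n, conf_level=0.95):
    """Large-sample z interval: xbar +/- z_{alpha/2} * s / sqrt(n)."""
    alpha = 1.0 - conf_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    # Summary quantities of Example 7.6: n = 50, xbar = 45,679.4, s = 26,641.675
    lower, upper = large_sample_mean_ci(45679.4, 26641.675, 50)
    print(round(lower, 1), round(upper, 1))
```

The function takes summary statistics rather than raw data so that it can be applied to published results where only n, $\bar{x}$, and s are reported.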


Unfortunately, the choice of sample size to yield a desired interval width is not
as straightforward here as it was for the case of known $\sigma$. This is because the width
of (7.8) is $2z_{\alpha/2}s/\sqrt{n}$. Since the value of s is not available before the data has been
gathered, the width of the interval cannot be determined solely by the choice of n. The
only option for an investigator who wishes to specify a desired width is to make an
educated guess as to what the value of s might be. By being conservative and guessing
a larger value of s, an n larger than necessary will be chosen. The investigator may
be able to specify a reasonably accurate value of the population range (the difference
between the largest and smallest values). Then if the population distribution is not too
skewed, dividing the range by 4 gives a ballpark value of what s might be.


EXAMPLE 7.7 The charge-to-tap time (min) for carbon steel in one type of open hearth furnace is
to be determined for each heat in a sample of size n. If the investigator believes that
almost all times in the distribution are between 320 and 440, what sample size would
be appropriate for estimating the true average time to within 5 min. with a confidence
level of 95%?

A reasonable value for s is (440 − 320)/4 = 30. Thus

$$n = \left[\frac{(1.96)(30)}{5}\right]^2 = 138.3$$

Since the sample size must be an integer, n = 139 should be used. Note that estimating
to within 5 min. with the specified confidence level is equivalent to a CI
width of 10 min.
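The range-over-4 heuristic used in Example 7.7 can be written as a small Python sketch. The function name and argument names are hypothetical, and the result is only as good as the guessed range; this illustrates the calculation and is not a recommendation about how to elicit the range.

```python
import math
from statistics import NormalDist


def sample_size_from_range(low, high, bound, conf_level=0.95):
    """Sample size for estimating a mean to within `bound`, guessing s = range / 4."""
    s_guess = (high - low) / 4.0                      # ballpark value for s
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    return math.ceil((z * s_guess / bound) ** 2)      # round up to an integer n


if __name__ == "__main__":
    # Example 7.7: times believed to lie between 320 and 440, bound of 5 min
    print(sample_size_from_range(320, 440, 5))  # 139
```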


A General Large-Sample Confidence Interval


The large-sample intervals $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$ are special cases
of a general large-sample CI for a parameter $\theta$. Suppose that $\hat{\theta}$ is an estimator satisfying
the following properties: (1) It has approximately a normal distribution;
(2) it is (at least approximately) unbiased; and (3) an expression for $\sigma_{\hat{\theta}}$, the standard
deviation (standard error) of $\hat{\theta}$, is available. For example, in the case $\theta = \mu$, $\hat{\mu} = \bar{X}$
is an unbiased estimator whose distribution is approximately normal when n is large
and $\sigma_{\hat{\mu}} = \sigma_{\bar{X}} = \sigma/\sqrt{n}$. Standardizing $\hat{\theta}$ yields the rv $Z = (\hat{\theta} - \theta)/\sigma_{\hat{\theta}}$, which has
approximately a standard normal distribution. This justifies the probability statement

$$P\left(-z_{\alpha/2} < \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} < z_{\alpha/2}\right) \approx 1 - \alpha \qquad (7.9)$$

Assume first that $\sigma_{\hat{\theta}}$ does not involve any unknown parameters (e.g., known $\sigma$ in
the case $\theta = \mu$). Then replacing each < in (7.9) by = results in $\theta = \hat{\theta} \pm z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$,
so the lower and upper confidence limits are $\hat{\theta} - z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$ and $\hat{\theta} + z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$, respectively.
Now suppose that $\sigma_{\hat{\theta}}$ does not involve $\theta$ but does involve at least one other
unknown parameter. Let $s_{\hat{\theta}}$ be the estimate of $\sigma_{\hat{\theta}}$ obtained by using estimates in
place of the unknown parameters (e.g., $s/\sqrt{n}$ estimates $\sigma/\sqrt{n}$). Under general
conditions (essentially that $s_{\hat{\theta}}$ be close to $\sigma_{\hat{\theta}}$ for most samples), a valid CI is
$\hat{\theta} \pm z_{\alpha/2} \cdot s_{\hat{\theta}}$. The large-sample interval $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$ is an example.

Finally, suppose that $\sigma_{\hat{\theta}}$ does involve the unknown $\theta$. For example, we shall
see momentarily that this is the case when $\theta = p$, a population proportion. Then
$(\hat{\theta} - \theta)/\sigma_{\hat{\theta}} = \pm z_{\alpha/2}$ can be difficult to solve. An approximate solution can often be
obtained by replacing $\theta$ in $\sigma_{\hat{\theta}}$ by its estimate $\hat{\theta}$. This results in an estimated standard
deviation $s_{\hat{\theta}}$, and the corresponding interval is again $\hat{\theta} \pm z_{\alpha/2} \cdot s_{\hat{\theta}}$.

In words, this CI is


point estimate of $\theta$ ± (z critical value)(estimated standard error of the estimator)


A Confidence Interval for a Population Proportion


Let p denote the proportion of “successes” in a population, where success identifies
an individual or object that has a specified property (e.g., individuals who graduated
from college, computers that do not need warranty service, etc.). A random
sample of n individuals or objects is to be selected, and X is the number of successes
in the sample. Provided that n is small compared to the population size, X can be
regarded as a binomial rv with $E(X) = np$ and $\sigma_X = \sqrt{np(1 - p)}$. Furthermore, if
both $np \ge 10$ and $nq \ge 10$ ($q = 1 - p$), X has approximately a normal distribution.

The natural estimator of p is $\hat{p} = X/n$, the sample fraction of successes. Since
$\hat{p}$ is just X multiplied by the constant 1/n, $\hat{p}$ also has approximately a normal distribution.
As shown in Section 6.1, $E(\hat{p}) = p$ (unbiasedness) and $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$.
The standard deviation $\sigma_{\hat{p}}$ involves the unknown parameter p. Standardizing $\hat{p}$ by
subtracting p and dividing by $\sigma_{\hat{p}}$ then implies that

$$P\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

Proceeding as suggested in the subsection “Deriving a Confidence Interval”
(Section 7.1), the confidence limits result from replacing each < by = and solving
the resulting equation for p. But whereas the equations $(\bar{x} - \mu)/(s/\sqrt{n}) = \pm z_{\alpha/2}$
employed in deriving the large-sample CI for $\mu$ are linear in $\mu$, the equations here are
quadratic ($p^2$ appears in the numerator when both sides of each equation are squared
to eliminate the square root). The two roots are

$$p = \frac{\hat{p} + z_{\alpha/2}^2/2n}{1 + z_{\alpha/2}^2/n} \pm z_{\alpha/2}\,\frac{\sqrt{\hat{p}(1 - \hat{p})/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n} = \tilde{p} \pm z_{\alpha/2}\,\frac{\sqrt{\hat{p}\hat{q}/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n}$$


PROPOSITION Let $\tilde{p} = [\hat{p} + z_{\alpha/2}^2/2n]/[1 + z_{\alpha/2}^2/n]$. Then a confidence interval for a
population proportion p with confidence level approximately $100(1 - \alpha)\%$ is

$$\tilde{p} \pm z_{\alpha/2}\,\frac{\sqrt{\hat{p}\hat{q}/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n} \qquad (7.10)$$

where $\hat{q} = 1 - \hat{p}$ and, as before, the − in (7.10) corresponds to the lower
confidence limit and the + to the upper confidence limit.


This is often referred to as the score CI for p.


If the sample size n is very large, then $z^2/2n$ is generally quite negligible (small) compared
to $\hat{p}$ and $z^2/n$ is quite negligible compared to 1, from which $\tilde{p} \approx \hat{p}$. In this case
$z^2/4n^2$ is also negligible compared to $\hat{p}\hat{q}/n$ ($n^2$ is a much larger divisor than is n). As
a result, the dominant term in the ± expression is $z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ and the score interval
is approximately

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} \qquad (7.11)$$

This latter interval has the general form $\hat{\theta} \pm z_{\alpha/2}\hat{\sigma}_{\hat{\theta}}$ of a large-sample interval suggested
in the last subsection. The approximate CI (7.11) is the one that for decades
has appeared in introductory statistics textbooks. It clearly has a much simpler and
more appealing form than the score CI. So why bother with the latter?

First of all, suppose we use $z_{.025} = 1.96$ in the traditional formula (7.11). Then
our nominal confidence level (the one we think we’re buying by using that z critical 
value) is approximately 95%. So before a sample is selected, the probability that the 
random interval includes the actual value of p (i.e., the coverage probability) should 
be about .95. But as Figure 7.6 shows for the case n = 100, the actual coverage 
probability for this interval can differ considerably from the nominal probability 
.95, particularly when p is not close to .5 (the graph of coverage probability versus 
p is very jagged because the underlying binomial probability distribution is discrete 
rather than continuous). This is generally speaking a deficiency of the traditional 
interval—the actual confidence level can be quite different from the nominal level 
even for reasonably large sample sizes. Recent research has shown that the score 
interval rectifies this behavior—for virtually all sample sizes and values of p, its 
actual confidence level will be quite close to the nominal level specified by the 
choice of $z_{\alpha/2}$. This is due largely to the fact that the score interval is shifted a bit
toward .5 compared to the traditional interval. In particular, the midpoint $\tilde{p}$ of the
score interval is always a bit closer to .5 than is the midpoint $\hat{p}$ of the traditional
interval. This is especially important when p is close to 0 or 1. 


Figure 7.6 Actual coverage probability for the interval (7.11) for varying values of p when
n = 100 [plot not reproduced; vertical axis: coverage probability, 0.86 to 0.96; horizontal axis: p, 0 to 1]
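Actual coverage probabilities of the kind plotted in Figure 7.6 can be computed exactly by summing binomial probabilities over those outcomes x whose interval contains p. The Python sketch below is a hedged illustration, not the computation used to produce the figure; the function names are hypothetical and the critical value is obtained from `NormalDist`.

```python
import math
from statistics import NormalDist

Z_975 = NormalDist().inv_cdf(0.975)   # about 1.96, for nominal 95% intervals


def traditional_interval(x, n, z=Z_975):
    """Interval (7.11): p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def score_interval(x, n, z=Z_975):
    """Score interval (7.10), centered at p_tilde rather than p_hat."""
    p_hat = x / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def coverage(interval, n, p):
    """Exact probability that the random interval contains p when X ~ Bin(n, p)."""
    total = 0.0
    for x in range(n + 1):
        low, high = interval(x, n)
        if low <= p <= high:
            total += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


if __name__ == "__main__":
    for p in (0.05, 0.2, 0.5):
        print(p,
              round(coverage(traditional_interval, 100, p), 3),
              round(coverage(score_interval, 100, p), 3))
```

Evaluating `coverage` on a fine grid of p values would trace out the jagged curve described in the text; the jaggedness arises because the set of outcomes x whose interval captures p changes abruptly as p moves.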


In addition, the score interval can be used with nearly all sample sizes and 
parameter values. It is thus not necessary to check the conditions $n\hat{p} \ge 10$ and
$n(1 - \hat{p}) \ge 10$ that would be required were the traditional interval employed. So
rather than asking when n is large enough for (7.11) to yield a good approximation 
to (7.10), our recommendation is that the score CI should always be used. The slight 
additional tediousness of the computation is outweighed by the desirable properties 
of the interval. 


EXAMPLE 7.8 The article “Repeatability and Reproducibility for Pass/Fail Data” (J. of Testing
and Eval., 1997: 151-153) reported that in n = 48 trials in a particular laboratory,
16 resulted in ignition of a particular type of substrate by a lighted cigarette. Let p
denote the long-run proportion of all such trials that would result in ignition. A point
estimate for p is $\hat{p} = 16/48 = .333$. A confidence interval for p with a confidence
level of approximately 95% is

$$\frac{.333 + (1.96)^2/96}{1 + (1.96)^2/48} \pm (1.96)\,\frac{\sqrt{(.333)(.667)/48 + (1.96)^2/9216}}{1 + (1.96)^2/48} = .345 \pm .129 = (.216, .474)$$

This interval is quite wide because a sample size of 48 is not at all large when estimating
a proportion.

The traditional interval is

$$.333 \pm 1.96\sqrt{(.333)(.667)/48} = .333 \pm .133 = (.200, .466)$$

These two intervals would be in much closer agreement were the sample size substantially
larger.
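The two intervals of Example 7.8 can be checked with a short Python sketch. The function names are hypothetical, and because the code uses $\hat{p} = 16/48$ and an unrounded z critical value instead of .333 and 1.96, the limits may differ from the hand calculation in the third decimal place.

```python
import math
from statistics import NormalDist


def score_ci(successes, n, conf_level=0.95):
    """Score CI (7.10) for a population proportion."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    denom = 1.0 + z**2 / n
    p_tilde = (p_hat + z**2 / (2 * n)) / denom           # midpoint of the interval
    half = z * math.sqrt(p_hat * q_hat / n + z**2 / (4 * n**2)) / denom
    return p_tilde - half, p_tilde + half


def traditional_ci(successes, n, conf_level=0.95):
    """Traditional CI (7.11): p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half, p_hat + half


if __name__ == "__main__":
    # Example 7.8: 16 ignitions in n = 48 trials
    print([round(v, 3) for v in score_ci(16, 48)])
    print([round(v, 3) for v in traditional_ci(16, 48)])
```

Comparing the two outputs shows the score interval shifted toward .5 relative to the traditional interval, as the discussion of the midpoints $\tilde{p}$ and $\hat{p}$ suggests.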


Equating the width of the CI for p to a prespecified width w gives a quadratic
equation for the sample size n necessary to give an interval with a desired degree of
precision. Suppressing the subscript in $z_{\alpha/2}$, the solution is

$$n = \frac{2z^2\hat{p}\hat{q} - z^2w^2 + \sqrt{4z^4\hat{p}\hat{q}(\hat{p}\hat{q} - w^2) + w^2z^4}}{w^2} \qquad (7.12)$$

Neglecting the terms in the numerator involving $w^2$ gives

$$n \approx \frac{4z^2\hat{p}\hat{q}}{w^2}$$

This latter expression is what results from equating the width of the traditional interval
to w.

These formulas unfortunately involve the unknown $\hat{p}$. The most conservative
approach is to take advantage of the fact that $\hat{p}\hat{q}$ [$= \hat{p}(1 - \hat{p})$] is maximized at
$\hat{p} = .5$. Thus if $\hat{p} = \hat{q} = .5$ is used in (7.12), the width will be at most w regardless
of what value of $\hat{p}$ results from the sample. Alternatively, if the investigator believes
strongly, based on prior information, that $p \le p_0 \le .5$, then $p_0$ can be used in place
of $\hat{p}$. A similar comment applies when $p \ge p_0 \ge .5$.


EXAMPLE 7.9 The width of the 95% CI in Example 7.8 is .258. The value of n necessary to ensure
a width of .10 irrespective of the value of $\hat{p}$ is

$$n = \frac{2(1.96)^2(.25) - (1.96)^2(.01) + \sqrt{4(1.96)^4(.25)(.25 - .01) + (.01)(1.96)^4}}{.01} = 380.3$$

Thus a sample size of 381 should be used. The expression for n based on the traditional
CI gives a slightly larger value of 385.
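Formula (7.12) and its traditional approximation can be compared numerically. The following Python sketch is illustrative only: the function names are hypothetical, `p_guess` stands for whatever value is substituted for $\hat{p}$ (with .5 as the conservative default), and the z critical value is taken from `NormalDist`.

```python
import math
from statistics import NormalDist


def _z(conf_level):
    """Two-sided z critical value for the given confidence level."""
    return NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)


def n_for_score_width(width, p_guess=0.5, conf_level=0.95):
    """Solution (7.12) for n, before rounding up, using the + root."""
    z2 = _z(conf_level) ** 2
    w2 = width**2
    pq = p_guess * (1.0 - p_guess)
    root = math.sqrt(4 * z2**2 * pq * (pq - w2) + w2 * z2**2)
    return (2 * z2 * pq - z2 * w2 + root) / w2


def n_for_traditional_width(width, p_guess=0.5, conf_level=0.95):
    """Approximation 4 z^2 p q / w^2 obtained from the traditional interval."""
    z2 = _z(conf_level) ** 2
    return 4 * z2 * p_guess * (1.0 - p_guess) / width**2


if __name__ == "__main__":
    # Example 7.9: width .10, conservative choice p = q = .5, 95% confidence
    print(math.ceil(n_for_score_width(0.10)))        # 381
    print(math.ceil(n_for_traditional_width(0.10)))  # 385
```

Both functions return the unrounded solution so that the caller can see how far it lies from the next integer before applying `math.ceil`.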


One-Sided Confidence Intervals 
(Confidence Bounds) 


The confidence intervals discussed thus far give both a lower confidence bound and 
an upper confidence bound for the parameter being estimated. In some circum- 
stances, an investigator will want only one of these two types of bounds. For exam- 
ple, a psychologist may wish to calculate a 95% upper confidence bound for true 
average reaction time to a particular stimulus, or a reliability engineer may want 
only a lower confidence bound for true average lifetime of components of a certain
type. Because the cumulative area under the standard normal curve to the left of
1.645 is .95,

$$P\left(\frac{\bar{X} - \mu}{S/\sqrt{n}} < 1.645\right) \approx .95$$

Manipulating the inequality inside the parentheses to isolate $\mu$ on one side and
replacing rv’s by calculated values gives the inequality $\mu > \bar{x} - 1.645s/\sqrt{n}$;
the expression on the right is the desired lower confidence bound. Starting with 
P(—1.645 < Z) ~ .95 and manipulating the inequality results in the upper confi- 
dence bound. A similar argument gives a one-sided bound associated with any other 
confidence level. 


PROPOSITION A large-sample upper confidence bound for $\mu$ is

$$\mu < \bar{x} + z_{\alpha} \cdot \frac{s}{\sqrt{n}}$$

and a large-sample lower confidence bound for $\mu$ is

$$\mu > \bar{x} - z_{\alpha} \cdot \frac{s}{\sqrt{n}}$$

A one-sided confidence bound for p results from replacing $z_{\alpha/2}$ by $z_{\alpha}$ and ±
by either + or − in the CI formula (7.10) for p. In all cases the confidence
level is approximately $100(1 - \alpha)\%$.


EXAMPLE 7.10 Titanium and its alloys have found increasing use in aerospace and automotive applications
because of durability and high strength-to-weight ratios. However, machining
can be difficult because of low thermal conductivity. The article “Modeling
and Multi-Objective Optimization of Process Parameters of Wire Electrical
Discharge Machining Using Non-Dominated Sorting Genetic Algorithm-II”
(J. of Engr. Manuf., 2012: 1186-2001) described an investigation into different
settings that impact wire electrical discharge machining of titanium 6-2-4-2. One
characteristic of interest was surface roughness ($\mu$m) of the metal after machining. A
sample of 54 surface roughness observations resulted in a sample mean roughness
of 1.9042 and a sample standard deviation of .1455. An upper confidence bound for
true average roughness $\mu$ with confidence level 95% requires $z_{.05} = 1.645$ (not the
value $z_{.025} = 1.96$ needed for a two-sided CI). The bound is

$$1.9042 + (1.645) \cdot \frac{(.1455)}{\sqrt{54}} = 1.9042 + .0326 = 1.9368$$

Thus we estimate with a confidence level of roughly 95% that $\mu$ < 1.9368.
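One-sided bounds differ from the two-sided interval only in the critical value, $z_{\alpha}$ in place of $z_{\alpha/2}$. A minimal Python sketch follows; the function names are hypothetical and the critical value comes from the standard library rather than a table.

```python
import math
from statistics import NormalDist


def upper_confidence_bound(xbar, s, n, conf_level=0.95):
    """Large-sample upper bound: xbar + z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(conf_level)   # 1.645 for 95%, not 1.96
    return xbar + z_alpha * s / math.sqrt(n)


def lower_confidence_bound(xbar, s, n, conf_level=0.95):
    """Large-sample lower bound: xbar - z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(conf_level)
    return xbar - z_alpha * s / math.sqrt(n)


if __name__ == "__main__":
    # Example 7.10: n = 54, sample mean 1.9042, sample standard deviation .1455
    print(round(upper_confidence_bound(1.9042, 0.1455, 54), 4))  # 1.9368
```

Because all of the error probability $\alpha$ is placed in one tail, the one-sided bound lies closer to $\bar{x}$ than the corresponding limit of a two-sided interval with the same confidence level.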


EXERCISES Section 7.2 (12-27) 


12. The following observations are lifetimes (days) subsequent to diagnosis for
individuals suffering from blood cancer (“A Goodness of Fit Approach to the
Class of Life Distributions with Unknown Age,” Quality and Reliability Engr.
Intl., 2012: 761-766):

115 181 255 418 441 461 516 739 743 789 807
Ber 248) 20S ARS be 168 168" Piet cides: 1222. d25d
1277 1290 1357 1369 1408 1455 1478 1519 1578 1578 1599
1603 1605 1696 1735 1799 1815 1852 1899 1925 1965

a. Can a confidence interval for true average lifetime be calculated without
assuming anything about the nature of the lifetime distribution? Explain your
reasoning. [Note: A normal probability plot of the data exhibits a reasonably
linear pattern.]

b. Calculate and interpret a confidence interval with a 99% confidence level for
true average lifetime. [Hint: x̄ = 1191.6 and s = 506.6.]


13. The article “Gas Cooking, Kitchen Ventilation, and Exposure to Combustion
Products” (Indoor Air, 2006: 65-73) reported that for a sample of 50 kitchens
with gas cooking appliances monitored during a one-week period, the sample mean
CO₂ level (ppm) was 654.16, and the sample standard deviation was 164.43.

a. Calculate and interpret a 95% (two-sided) confidence interval for true average
CO₂ level in the population of all homes from which the sample was selected.

b. Suppose the investigators had made a rough guess of 175 for the value of s
before collecting data. What sample size would be necessary to obtain an interval
width of 50 ppm for a confidence level of 95%?


14. The negative effects of ambient air pollution on children's lung function
have been well established, but less research is available about the impact of
indoor air pollution. The authors of “Indoor Air Pollution and Lung Function
Growth Among Children in Four Chinese Cities” (Indoor Air, 2012: 3-11)
investigated the relationship between indoor air-pollution metrics and lung
function growth among children ages 6-13 years living in four Chinese cities.
For each subject in the study, the authors measured an important lung-capacity
index known as FEV₁, the forced volume (in ml) of air that is exhaled in
1 second. Higher FEV₁ values are associated with greater lung capacity. Among
the children in the study, 514 came from households that used coal for cooking
or heating or both. Their FEV₁ mean was 1427 with a standard deviation of 325.
(A complex statistical procedure was used to show that burning coal had a clear
negative effect on mean FEV₁ levels.)

a. Calculate and interpret a 95% (two-sided) confidence interval for true average
FEV₁ level in the population of all children from which the sample was selected.
Does it appear that the parameter of interest has been accurately estimated?

b. Suppose the investigators had made a rough guess of 320 for the value of s
before collecting data. What sample size would be necessary to obtain an interval
width of 50 ml for a confidence level of 95%?


15. Determine the confidence level for each of the following large-sample
one-sided confidence bounds:

a. Upper bound: x̄ + .84s/√n

b. Lower bound: x̄ − 2.05s/√n

c. Upper bound: x̄ + .67s/√n


16. The alternating current (AC) breakdown voltage of an insulating liquid
indicates its dielectric strength. The article “Testing Practices for the AC
Breakdown Voltage Testing of Insulation Liquids” (IEEE Electrical Insulation
Magazine, 1995: 21-26) gave the accompanying sample observations on breakdown
voltage (kV) of a particular circuit under certain conditions.

62 50 53 57 41 53 55 61 59 64 50 53 64 62 50 68
54 55 57 50 55 50 56 55 46 55 53 54 52 47 47 55
57 48 63 57 57 55 53 59 53 52 50 55 60 50 56 58

a. Construct a boxplot of the data and comment on interesting features.

b. Calculate and interpret a 95% CI for true average breakdown voltage μ. Does it
appear that μ has been precisely estimated? Explain.

c. Suppose the investigator believes that virtually all values of breakdown
voltage are between 40 and 70. What sample size would be appropriate for the 95%
CI to have a width of 2 kV (so that μ is estimated to within 1 kV with 95%
confidence)?


17. Exercise 1.13 gave a sample of ultimate tensile strength observations (ksi).
Use the accompanying descriptive statistics output from Minitab to calculate a
99% lower confidence bound for true average ultimate tensile strength, and
interpret the result.

N     Mean     Median   TrMean   StDev   SE Mean
153   135.39   135.40   135.41   4.59    0.37

Minimum   Maximum   Q1       Q3
122.20    147.70    132.95   138.25


18. The U.S. Army commissioned a study to assess how deeply a bullet penetrates
ceramic body armor (“Testing Body Armor Materials for Use by the U.S. Army-
Phase III,” 2012). In the standard test, a cylindrical clay model is layered
under the armor vest. A projectile is then fired, causing an indentation in the
clay. The deepest impression in the clay is measured as an indication of
survivability of someone wearing the armor. Here is data from one testing
organization under particular experimental conditions; measurements (in mm) were
made using a manually controlled digital caliper:

22.4 23.6 24.0 24.9 25.5 25.6
25.8 26.1 26.4 26.7 27.4 27.6
28.3 29.0 29.1 29.6 29.7 29.8
29.9 30.0 30.4 30.5 30.7 30.7
31.0 31.0 31.4 31.6 31.7 31.9
31.9 32.0 32.1 32.4 32.5 32.5
32.6 32.9 33.1 33.3 33.5 33.5
33.5 33.5 33.6 33.6 33.8 33.9
34.1 34.2 34.6 34.6 35.0 35.2
35.2 35.4 35.4 35.4 35.5 35.7
35.8 36.0 36.0 36.0 36.1 36.1
36.2 36.4 36.6 37.0 37.4 37.5
37.5 38.0 38.7 38.8 39.8 41.0
42.0 42.1 44.6 48.3 55.0

a. Construct a boxplot of the data and comment on interesting features.

b. Construct a normal probability plot. Is it plausible that impression depth is
normally distributed? Is a normal distribution assumption needed in order to
calculate a confidence interval or bound for the true average depth μ using the
foregoing data? Explain.

c. Use the accompanying Minitab output as a basis for calculating and
interpreting an upper confidence bound for μ with a confidence level of 99%.

Variable   Count   Mean     SE Mean   StDev
Depth      83      33.370   0.578     5.268

Q1       Median   Q3       IQR
30.400   33.500   36.000   5.600


19. The article “Limited Yield Estimation for Visual Defect 
Sources” (IEEE Trans. on Semiconductor Manuf., 
1997: 17-23) reported that, in a study of a particular wafer 
inspection process, 356 dies were examined by an inspec- 
tion probe and 201 of these passed the probe. Assuming a 
stable process, calculate a 95% (two-sided) confidence 
interval for the proportion of all dies that pass the probe. 


20. TV advertising agencies face increasing challenges in 
reaching audience members because viewing TV programs 
via digital streaming is gaining in popularity. The Harris 
poll reported on November 13, 2012, that 53% of 2343 
American adults surveyed said they have watched digitally 
streamed TV programming on some type of device. 

a. Calculate and interpret a confidence interval at the 
99% confidence level for the proportion of all adult 
Americans who watched streamed programming up 
to that point in time. 

b. What sample size would be required for the width of a 
99% CI to be at most .05 irrespective of the value of p? 


21. In a sample of 1000 randomly selected consumers who
had opportunities to send in a rebate claim form after
purchasing a product, 250 of these people said they never
did so (“Rebates: Get What You Deserve,” Consumer
Reports, May 2009: 7). Reasons cited for their behavior
included too many steps in the process, amount too small, 
missed deadline, fear of being placed on a mailing list, 
lost receipt, and doubts about receiving the money. 
Calculate an upper confidence bound at the 95% confi- 
dence level for the true proportion of such consumers who 
never apply for a rebate. Based on this bound, is there 
compelling evidence that the true proportion of such con- 
sumers is smaller than 1/3? Explain your reasoning. 


22. The technology underlying hip replacements has changed 
as these operations have become more popular (over 
250,000 in the United States in 2008). Starting in 2003, 
highly durable ceramic hips were marketed. Unfortunately, 
for too many patients the increased durability has been 
counterbalanced by an increased incidence of squeaking. 
The May 11, 2008, issue of the New York Times reported 
that in one study of 143 individuals who received ceram- 
ic hips between 2003 and 2005, 10 of the hips developed 
squeaking. 

a. Calculate a lower confidence bound at the 95% con- 
fidence level for the true proportion of such hips that 
develop squeaking. 

b. Interpret the 95% confidence level used in (a). 
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23. The Pew Forum on Religion and Public Life reported 
on Dec. 9, 2009, that in a survey of 2003 American 
adults, 25% said they believed in astrology. 

a. Calculate and interpret a confidence interval at the 
99% confidence level for the proportion of all adult 
Americans who believe in astrology. 

b. What sample size would be required for the width of 
a 99% CI to be at most .05 irrespective of the value 
of p? 


24. A sample of 56 research cotton samples resulted in a
sample average percentage elongation of 8.17 and a
sample standard deviation of 1.42 (“An Apparent
Relation Between the Spiral Angle φ, the Percent
Elongation E₁, and the Dimensions of the Cotton
Fiber,” Textile Research J., 1978: 407-410). Calculate
a 95% large-sample CI for the true average percentage
elongation μ. What assumptions are you making about
the distribution of percentage elongation?


25. A state legislator wishes to survey residents of her 
district to see what proportion of the electorate is 
aware of her position on using state funds to pay for 
abortions. 

a. What sample size is necessary if the 95% CI for p is 
to have a width of at most .10 irrespective of p? 

b. If the legislator has strong reason to believe that at 
least 2/3 of the electorate know of her position, how 
large a sample size would you recommend? 


26. The superintendent of a large school district, having once
had a course in probability and statistics, believes that
the number of teachers absent on any given day has a
Poisson distribution with parameter μ. Use the accompanying
data on absences for 50 days to obtain a large-sample
CI for μ. [Hint: The mean and variance of a
Poisson variable both equal μ, so

\[ Z = \frac{\bar{X} - \mu}{\sqrt{\mu/n}} \]

has approximately a standard normal distribution. Now
proceed as in the derivation of the interval for p by making
a probability statement (with probability 1 − α) and solving
the resulting inequalities for μ; see the argument just
after (7.10).]

Number of absences   0   1   2   3    4   5   6   7   8   9   10
Frequency            1   4   8   10   8   7   5   3   2   1   1


27. Reconsider the CI (7.10) for p, and focus on a confidence
level of 95%. Show that the confidence limits agree quite
well with those of the traditional interval (7.11) once two
successes and two failures have been appended to the
sample [i.e., (7.11) based on x + 2 S's in n + 4 trials].
[Hint: 1.96 ≈ 2. Note: Agresti and Coull showed that this
adjustment of the traditional interval also has an actual
confidence level close to the nominal level.]


7.3 Intervals Based on a Normal Population Distribution


The CI for μ presented in Section 7.2 is valid provided that n is large. The resulting
interval can be used whatever the nature of the population distribution. The CLT cannot
be invoked, however, when n is small. In this case, one way to proceed is to make
a specific assumption about the form of the population distribution and then derive a
CI tailored to that assumption. For example, we could develop a CI for μ when the
population is described by a gamma distribution, another interval for the case of a 
Weibull distribution, and so on. Statisticians have indeed carried out this program for 
a number of different distributional families. Because the normal distribution is more 
frequently appropriate as a population model than is any other type of distribution, 
we will focus here on a CI for this situation. 


ASSUMPTION The population of interest is normal, so that $X_1, \ldots, X_n$ constitutes a random
sample from a normal distribution with both μ and σ unknown.


The key result underlying the interval in Section 7.2 was that for large n, the rv
Z = (X̄ − μ)/(S/√n) has approximately a standard normal distribution. When n
is small, the additional variability in the denominator implies that the probability
distribution of (X̄ − μ)/(S/√n) will be more spread out than the standard normal
distribution. The result on which inferences are based introduces a new family of
probability distributions called t distributions.


THEOREM When X̄ is the mean of a random sample of size n from a normal distribution
with mean μ, the rv

\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \tag{7.13} \]

has a probability distribution called a t distribution with n − 1 degrees of
freedom (df).


Properties of t Distributions 


Before applying this theorem, a discussion of properties of t distributions is in
order. Although the variable of interest is still (X̄ − μ)/(S/√n), we now denote it
by T to emphasize that it does not have a standard normal distribution when n
is small. Recall that a normal distribution is governed by two parameters; each
different choice of μ in combination with σ gives a particular normal distribution.
Any particular t distribution results from specifying the value of a single
parameter, called the number of degrees of freedom, abbreviated df. We'll denote this
parameter by the Greek letter ν. Possible values of ν are the positive integers 1,
2, 3, .... So there is a t distribution with 1 df, another with 2 df, yet another with
3 df, and so on.

For any fixed value of ν, the density function that specifies the associated t curve
is even more complicated than the normal density function. Fortunately, we need
concern ourselves only with several of the more important features of these curves.


Properties of t Distributions
Let $t_\nu$ denote the t distribution with ν df.

1. Each $t_\nu$ curve is bell-shaped and centered at 0.
2. Each $t_\nu$ curve is more spread out than the standard normal (z) curve.
3. As ν increases, the spread of the corresponding $t_\nu$ curve decreases.
4. As ν → ∞, the sequence of $t_\nu$ curves approaches the standard normal curve
(so the z curve is often called the t curve with df = ∞).


Figure 7.7 illustrates several of these properties for selected values of v. 


Figure 7.7 $t_\nu$ and z curves (curves shown: z curve, $t_{25}$ curve, $t_5$ curve)


The number of df for T in (7.13) is n − 1 because, although S is based on
the n deviations $X_1 - \bar{X}, \ldots, X_n - \bar{X}$, the fact that $\sum (X_i - \bar{X}) = 0$ implies that only n − 1 of these
are “freely determined.” The number of df for a t variable is the number of freely
determined deviations on which the estimated standard deviation in the denominator
of T is based.

The use of t distributions in making inferences requires notation for capturing
t-curve tail areas analogous to $z_\alpha$ for the z curve. You might think that $t_\alpha$ would do
the trick. However, the desired value depends not only on the tail area captured but
also on df.


NOTATION Let $t_{\alpha,\nu}$ = the number on the measurement axis for which the area under the
t curve with ν df to the right of $t_{\alpha,\nu}$ is α; $t_{\alpha,\nu}$ is called a t critical value.


For example, $t_{.05,6}$ is the t critical value that captures an upper-tail area of .05 under the
t curve with 6 df. The general notation is illustrated in Figure 7.8. Because t curves
are symmetric about zero, $-t_{\alpha,\nu}$ captures lower-tail area α. Appendix Table A.5 gives
$t_{\alpha,\nu}$ for selected values of α and ν. This table also appears inside the back cover. The
columns of the table correspond to different values of α. To obtain $t_{.05,15}$, go to the
α = .05 column, look down to the ν = 15 row, and read $t_{.05,15}$ = 1.753. Similarly,
$t_{.05,22}$ = 1.717 (.05 column, ν = 22 row), and $t_{.01,22}$ = 2.508.


Figure 7.8 Illustration of a t critical value ($t_\nu$ curve; the shaded area to the right of $t_{\alpha,\nu}$ equals α)


The values of $t_{\alpha,\nu}$ exhibit regular behavior as we move across a row or down
a column. For fixed ν, $t_{\alpha,\nu}$ increases as α decreases, since we must move farther to
the right of zero to capture area α in the tail. For fixed α, as ν is increased (i.e., as
we look down any particular column of the t table) the value of $t_{\alpha,\nu}$ decreases. This
is because a larger value of ν implies a t distribution with smaller spread, so it is not
necessary to go so far from zero to capture tail area α. Furthermore, $t_{\alpha,\nu}$ decreases
more slowly as ν increases. Consequently, the table values are shown in increments
of 2 between 30 df and 40 df and then jump to ν = 50, 60, 120, and finally ∞.
Because $t_\infty$ is the standard normal curve, the familiar $z_\alpha$ values appear in the last row
of the table. The rule of thumb suggested earlier for use of the large-sample CI (if
n > 40) comes from the approximate equality of the standard normal and t distributions
for ν ≥ 40.
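
When a t table is not at hand, the critical values described above can be approximated numerically. The following is a rough sketch using only the Python standard library: it integrates the t density with Simpson's rule and finds $t_{\alpha,\nu}$ by bisection. The function names (`t_pdf`, `upper_tail_area`, `t_critical`) are illustrative and are not part of the text; statistical packages provide equivalent, more carefully engineered routines.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail_area(t, df, steps=1000):
    """Area under the t curve to the right of t >= 0.

    Uses symmetry: the area is 0.5 minus the integral over [0, t],
    which is approximated with Simpson's rule (steps must be even).
    """
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Approximate t_{alpha,df} for 0 < alpha < 0.5 by bisection."""
    lo, hi = 0.0, 1.0
    while upper_tail_area(hi, df) > alpha:   # widen until the tail area is below alpha
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > alpha:
            lo = mid                          # still too much area: move right
        else:
            hi = mid
    return (lo + hi) / 2


# The three table lookups quoted in the text
for alpha, df in [(0.05, 15), (0.05, 22), (0.01, 22)]:
    print(f"t_{{{alpha},{df}}} = {t_critical(alpha, df):.3f}")

# Moving down the alpha = .05 column: values decrease toward z_.05
z_05 = NormalDist().inv_cdf(0.95)
for df in (5, 10, 30, 40, 120):
    print(f"df = {df:3d}: t = {t_critical(0.05, df):.3f}   (z = {z_05:.3f})")
```

The second loop illustrates the behavior described in the paragraph above: for fixed α the critical value shrinks as ν grows, and by ν = 40 it is already close to the corresponding z value.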


The One-Sample t Confidence Interval 


The standardized variable T has a t distribution with n − 1 df, and the area under
the corresponding t density curve between $-t_{\alpha/2,n-1}$ and $t_{\alpha/2,n-1}$ is 1 − α (area α/2
lies in each tail), so

\[ P(-t_{\alpha/2,n-1} < T < t_{\alpha/2,n-1}) = 1 - \alpha \tag{7.14} \]

Expression (7.14) differs from expressions in previous sections in that T and $t_{\alpha/2,n-1}$
are used in place of Z and $z_{\alpha/2}$, but it can be manipulated in the same manner to
obtain a confidence interval for μ.


PROPOSITION Let x̄ and s be the sample mean and sample standard deviation computed from
the results of a random sample from a normal population with mean μ. Then
a 100(1 − α)% confidence interval for μ is

\[ \left( \bar{x} - t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}} \right) \tag{7.15} \]

or, more compactly, x̄ ± $t_{\alpha/2,n-1}$ · s/√n.
An upper confidence bound for μ is

\[ \bar{x} + t_{\alpha,n-1} \cdot \frac{s}{\sqrt{n}} \]

and replacing + by − in this latter expression gives a lower confidence
bound for μ, both with confidence level 100(1 − α)%.


EXAMPLE 7.11 Even as traditional markets for sweetgum lumber have declined, large section solid
timbers traditionally used for construction bridges and mats have become increasingly
scarce. The article “Development of Novel Industrial Laminated Planks from
Sweetgum Lumber” (J. of Bridge Engr., 2008: 64-66) described the manufacturing
and testing of composite beams designed to add value to low-grade sweetgum lumber.
Here is data on the modulus of rupture (psi; the article contained summary data expressed
in MPa):

6807.99 7637.06 6663.28 6165.03 6991.41 6992.23
6981.46 7569.75 7437.88 6872.39 7663.18 6032.28
6906.04 6617.17 6984.12 7093.71 7659.50 7378.61
7295.54 6702.76 7440.17 8053.26 8284.75 7347.95
7422.69 7886.87 6316.67 7713.65 7503.33 7674.99


Figure 7.9 shows a normal probability plot from the R software. The straightness of
the pattern in the plot provides strong support for assuming that the population
distribution of MOR is at least approximately normal.


Figure 7.9 A normal probability plot of the modulus of rupture data (horizontal axis: Theoretical Quantiles)


The sample mean and sample standard deviation are 7203.191 and 543.5400, respectively
(for anyone bent on doing hand calculation, the computational burden is
eased a bit by subtracting 6000 from each x value to obtain $y_i = x_i - 6000$; then
$\sum y_i$ = 36,095.72 and $\sum y_i^2$ = 51,997,668.77, from which ȳ = 1203.191 and $s_y = s_x$
as given).

Let's now calculate a confidence interval for true average MOR using a
confidence level of 95%. The CI is based on n − 1 = 29 degrees of freedom, so the
necessary t critical value is $t_{.025,29}$ = 2.045. The interval estimate is now

\[ \bar{x} \pm t_{.025,29} \cdot \frac{s}{\sqrt{n}} = 7203.191 \pm (2.045) \cdot \frac{543.5400}{\sqrt{30}} = 7203.191 \pm 202.938 = (7000.253,\ 7406.129) \]

We estimate that 7000.253 < μ < 7406.129 with 95% confidence. If we use the
same formula on sample after sample, in the long run 95% of the calculated intervals
will contain μ. Since the value of μ is not available, we don't know whether
the calculated interval is one of the “good” 95% or the “bad” 5%. Even with the
moderately large sample size, our interval is rather wide. This is a consequence of
the substantial amount of sample variability in MOR values.

A lower 95% confidence bound would result from retaining only the lower
confidence limit (the one with −) and replacing 2.045 with $t_{.05,29}$ = 1.699. ■

Unfortunately, it is not easy to select n to control the width of the t interval.
This is because the width involves the unknown (before the data is collected) s and
because n enters not only through 1/√n but also through $t_{\alpha/2,n-1}$. As a result, an
appropriate n can be obtained only by trial and error.

In Chapter 15, we will discuss a small-sample CI for μ that is valid provided
only that the population distribution is symmetric, a weaker assumption
than normality. However, when the population distribution is normal, the t interval
tends to be narrower than would be any other interval with the same confidence
level.
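
For readers who want to reproduce the one-sample t interval of Example 7.11 from the raw data, here is a minimal Python sketch. It assumes the 30 modulus of rupture values exactly as listed in the example and takes the critical values 2.045 and 1.699 from the text rather than computing them; the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Modulus of rupture (psi) data from Example 7.11
mor = [
    6807.99, 7637.06, 6663.28, 6165.03, 6991.41, 6992.23,
    6981.46, 7569.75, 7437.88, 6872.39, 7663.18, 6032.28,
    6906.04, 6617.17, 6984.12, 7093.71, 7659.50, 7378.61,
    7295.54, 6702.76, 7440.17, 8053.26, 8284.75, 7347.95,
    7422.69, 7886.87, 6316.67, 7713.65, 7503.33, 7674.99,
]

T_TWO_SIDED = 2.045   # t_{.025,29}, as quoted in the example
T_ONE_SIDED = 1.699   # t_{.05,29}, as quoted in the example

n = len(mor)
xbar = mean(mor)
s = stdev(mor)                      # sample standard deviation (divisor n - 1)
std_err = s / math.sqrt(n)          # estimated standard error of the mean

half_width = T_TWO_SIDED * std_err
ci = (xbar - half_width, xbar + half_width)
lower_bound = xbar - T_ONE_SIDED * std_err

print(f"n = {n}, mean = {xbar:.3f}, s = {s:.4f}")
print(f"95% CI for mu: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% lower confidence bound: {lower_bound:.3f}")
```

Note that the one-sided bound uses the smaller critical value, so the lower bound sits closer to the sample mean than the lower limit of the two-sided interval.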


A Prediction Interval for a Single 
Future Value 


In many applications, the objective is to predict a single value of a variable to be 
observed at some future time, rather than to estimate the mean value of that variable. 


EXAMPLE 7.12 Consider the following sample of fat content (in percentage) of n = 10 randomly
selected hot dogs (“Sensory and Mechanical Assessment of the Quality of
Frankfurters,” J. of Texture Studies, 1990: 395-409):

25.2 21.3 22.8 17.0 29.8 21.0 25.5 16.0 20.9 19.5

Assuming that these were selected from a normal population distribution, a 95% CI
for (interval estimate of) the population mean fat content is

\[ \bar{x} \pm t_{.025,9} \cdot \frac{s}{\sqrt{n}} = 21.90 \pm 2.262 \cdot \frac{4.134}{\sqrt{10}} = 21.90 \pm 2.96 = (18.94,\ 24.86) \]

Suppose, however, you are going to eat a single hot dog of this type and want a
prediction for the resulting fat content. A point prediction, analogous to a point estimate,
is just x̄ = 21.90. This prediction unfortunately gives no information about reliability
or precision. ■


The general setup is as follows: We have available a random sample
$X_1, X_2, \ldots, X_n$ from a normal population distribution, and wish to predict the value of
$X_{n+1}$, a single future observation (e.g., the lifetime of a single lightbulb to be
purchased or the fuel efficiency of a single vehicle to be rented). A point predictor is X̄,
and the resulting prediction error is $\bar{X} - X_{n+1}$. The expected value of the prediction
error is

\[ E(\bar{X} - X_{n+1}) = E(\bar{X}) - E(X_{n+1}) = \mu - \mu = 0 \]

Since $X_{n+1}$ is independent of $X_1, \ldots, X_n$, it is independent of X̄, so the variance of the
prediction error is

\[ V(\bar{X} - X_{n+1}) = V(\bar{X}) + V(X_{n+1}) = \frac{\sigma^2}{n} + \sigma^2 = \sigma^2\left(1 + \frac{1}{n}\right) \]

The prediction error is normally distributed because it is a linear combination of
independent, normally distributed rv's. Thus

\[ Z = \frac{(\bar{X} - X_{n+1}) - 0}{\sqrt{\sigma^2\left(1 + \frac{1}{n}\right)}} = \frac{\bar{X} - X_{n+1}}{\sqrt{\sigma^2\left(1 + \frac{1}{n}\right)}} \]

has a standard normal distribution. It can be shown that replacing σ by the sample
standard deviation S (of $X_1, \ldots, X_n$) results in

\[ T = \frac{\bar{X} - X_{n+1}}{S\sqrt{1 + \frac{1}{n}}} \sim t \text{ distribution with } n - 1 \text{ df} \]

Manipulating this T variable as T = (X̄ − μ)/(S/√n) was manipulated in the
development of a CI gives the following result.


PROPOSITION A prediction interval (PI) for a single observation to be selected from a
normal population distribution is

\[ \bar{x} \pm t_{\alpha/2,n-1} \cdot s\sqrt{1 + \frac{1}{n}} \tag{7.16} \]

The prediction level is 100(1 − α)%. A lower prediction bound results from
replacing $t_{\alpha/2}$ by $t_\alpha$ and discarding the + part of (7.16); a similar modification
gives an upper prediction bound.


The interpretation of a 95% prediction level is similar to that of a 95% confidence
level. If the interval (7.16) is calculated for sample after sample and after each
calculation $X_{n+1}$ is observed, in the long run 95% of these intervals will include the
corresponding future values.
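
The long-run interpretation just described can be illustrated by simulation. The sketch below is not from the text: it repeatedly draws a sample of size n = 10 from a normal population, forms the 95% prediction interval (7.16), draws one further observation, and records how often the interval captures it. The population values μ = 22 and σ = 4, the seed, and the number of repetitions are arbitrary assumptions chosen for illustration; the critical value 2.262 is $t_{.025,9}$.

```python
import math
import random


def pi_coverage(n=10, t_crit=2.262, mu=22.0, sigma=4.0, reps=20000, seed=1):
    """Estimate the long-run proportion of prediction intervals (7.16)
    that contain the next observation X_{n+1} from a normal population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        half_width = t_crit * s * math.sqrt(1 + 1 / n)
        future = rng.gauss(mu, sigma)            # the single future value
        if xbar - half_width <= future <= xbar + half_width:
            hits += 1
    return hits / reps


coverage = pi_coverage()
print(f"Estimated coverage of the 95% PI: {coverage:.3f}")
```

With a normal population the estimated coverage should land close to .95; repeating the experiment with a strongly skewed population would show how heavily the prediction interval leans on the normality assumption.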


EXAMPLE 7.13 (Example 7.12 continued) With n = 10, x̄ = 21.90, s = 4.134, and $t_{.025,9}$ = 2.262,
a 95% PI for the fat content of a single hot dog is

\[ 21.90 \pm (2.262)(4.134)\sqrt{1 + \frac{1}{10}} = 21.90 \pm 9.81 = (12.09,\ 31.71) \]

This interval is quite wide, indicating substantial uncertainty about fat content.
Notice that the width of the PI is more than three times that of the CI. ■
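
The comparison between the CI of Example 7.12 and the PI of Example 7.13 can be reproduced directly from the ten fat-content observations. This is a small sketch with illustrative variable names; the critical value 2.262 is taken from the examples rather than computed.

```python
import math
from statistics import mean, stdev

# Fat content (%) of n = 10 hot dogs, from Example 7.12
fat = [25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5]

T_CRIT = 2.262   # t_{.025,9}, as quoted in Examples 7.12 and 7.13

n = len(fat)
xbar = mean(fat)
s = stdev(fat)

ci_half = T_CRIT * s / math.sqrt(n)          # estimation error: s / sqrt(n)
pi_half = T_CRIT * s * math.sqrt(1 + 1 / n)  # prediction error: s * sqrt(1 + 1/n)

print(f"mean = {xbar:.2f}, s = {s:.3f}")
print(f"95% CI: {xbar:.2f} +/- {ci_half:.2f}")
print(f"95% PI: {xbar:.2f} +/- {pi_half:.2f}")
print(f"PI width / CI width = {pi_half / ci_half:.2f}")
```

The ratio of the two half-widths is √(n + 1), which depends only on the sample size; for n = 10 it is about 3.3, consistent with the remark that the PI is more than three times as wide as the CI.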


The error of prediction is $\bar{X} - X_{n+1}$, a difference between two random variables,
whereas the estimation error is X̄ − μ, the difference between a random variable
and a fixed (but unknown) value. The PI is wider than the CI because there is
more variability in the prediction error (due to $X_{n+1}$) than in the estimation error.
In fact, as n gets arbitrarily large, the CI shrinks to the single value μ, and the PI
approaches μ ± $z_{\alpha/2}$ · σ. There is uncertainty about a single X value even when there
is no need to estimate.


Tolerance Intervals 


Consider a population of automobiles of a certain type, and suppose that under
specified conditions, fuel efficiency (mpg) has a normal distribution with μ = 30 and
σ = 2. Then since the interval from −1.645 to 1.645 captures 90% of the area under
the z curve, 90% of all these automobiles will have fuel efficiency values between
μ − 1.645σ = 26.71 and μ + 1.645σ = 33.29. But what if the values of μ and σ
are not known? We can take a sample of size n, determine the fuel efficiencies,
x̄ and s, and form the interval whose lower limit is x̄ − 1.645s and whose upper
limit is x̄ + 1.645s. However, because of sampling variability in the estimates of
μ and σ, there is a good chance that the resulting interval will include less than 90%
of the population values. Intuitively, to have an a priori 95% chance of the resulting
interval including at least 90% of the population values, when x̄ and s are used in
place of μ and σ we should also replace 1.645 by some larger number. For example,
when n = 20, the value 2.310 is such that we can be 95% confident that the interval
x̄ ± 2.310s will include at least 90% of the fuel efficiency values in the population.


Let k be a number between 0 and 100. A tolerance interval for capturing at
least k% of the values in a normal population distribution with a confidence
level 95% has the form

x̄ ± (tolerance critical value) · s

Tolerance critical values for k = 90, 95, and 99 in combination with various
sample sizes are given in Appendix Table A.6. This table also includes critical
values for a confidence level of 99% (these values are larger than the
corresponding 95% values). Replacing ± by + gives an upper tolerance bound, and
using − in place of ± results in a lower tolerance bound. Critical values for
obtaining these one-sided bounds also appear in Appendix Table A.6.


EXAMPLE 7.14 As part of a larger project to study the behavior of stressed-skin panels, a structural
component being used extensively in North America, the article “Time-Dependent
Bending Properties of Lumber” (J. of Testing and Eval., 1996: 187-193) reported
on various mechanical properties of Scotch pine lumber specimens. Consider the
following observations on modulus of elasticity (MPa) obtained 1 minute after
loading in a certain configuration:

10,490 16,620 17,300 15,480 12,970 17,260 13,400 13,900
13,630 13,260 14,370 11,700 15,470 17,840 14,070 14,760

There is a pronounced linear pattern in a normal probability plot of the data.
Relevant summary quantities are n = 16, x̄ = 14,532.5, s = 2055.67. For a
confidence level of 95%, a two-sided tolerance interval for capturing at least 95% of the
modulus of elasticity values for specimens of lumber in the population sampled uses
the tolerance critical value of 2.903. The resulting interval is

\[ 14{,}532.5 \pm (2.903)(2055.67) = 14{,}532.5 \pm 5967.6 = (8{,}564.9,\ 20{,}500.1) \]

We can be highly confident that at least 95% of all lumber specimens have modulus
of elasticity values between 8,564.9 and 20,500.1.

The 95% CI for μ is (13,437.3, 15,627.7), and the 95% prediction interval for
the modulus of elasticity of a single lumber specimen is (10,017.0, 19,048.0). Both
the prediction interval and the tolerance interval are substantially wider than the
confidence interval. ■
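
The three intervals compared in Example 7.14 can be computed side by side from the 16 observations. The sketch below uses the tolerance critical value 2.903 quoted in the example; the t critical value $t_{.025,15}$ = 2.131 is not quoted there and is supplied here as an assumption from a standard t table. Variable and function names are illustrative.

```python
import math
from statistics import mean, stdev

# Modulus of elasticity (MPa) data from Example 7.14
moe = [10490, 16620, 17300, 15480, 12970, 17260, 13400, 13900,
       13630, 13260, 14370, 11700, 15470, 17840, 14070, 14760]

TOL_CRIT = 2.903   # Table A.6 value quoted in the example (n = 16, 95% of values, 95% confidence)
T_CRIT = 2.131     # t_{.025,15}; assumed t-table value, not quoted in the example

n = len(moe)
xbar = mean(moe)
s = stdev(moe)


def interval(half_width):
    """Symmetric interval centered at the sample mean."""
    return (xbar - half_width, xbar + half_width)


ci = interval(T_CRIT * s / math.sqrt(n))           # for the population mean
pi = interval(T_CRIT * s * math.sqrt(1 + 1 / n))   # for one future observation
tol = interval(TOL_CRIT * s)                       # for at least 95% of the population

for label, (lo, hi) in [("CI", ci), ("PI", pi), ("Tolerance", tol)]:
    print(f"{label:9s}: ({lo:,.1f}, {hi:,.1f})   width = {hi - lo:,.1f}")
```

The printed widths make the ordering explicit: the confidence interval is the narrowest, the prediction interval is wider, and the tolerance interval is the widest, because each successive interval has to account for more of the variability among individual specimens.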


Intervals Based on Nonnormal Population 
Distributions 


The one-sample t CI for μ is robust to small or even moderate departures from
normality unless n is quite small. By this we mean that if a critical value for 95%
confidence, for example, is used in calculating the interval, the actual confidence level
will be reasonably close to the nominal 95% level. If, however, n is small and the
population distribution is highly nonnormal, then the actual confidence level may be
considerably different from the one you think you are using when you obtain a
particular critical value from the t table. It would certainly be distressing to believe that
your confidence level is about 95% when in fact it was really more like 88%! The
bootstrap technique, introduced in Section 7.1, has been found to be quite successful
at estimating parameters in a wide variety of nonnormal situations.

In contrast to the confidence interval, the validity of the prediction and tolerance
intervals described in this section is closely tied to the normality assumption.
These latter intervals should not be used in the absence of compelling evidence for
normality. The excellent reference Statistical Intervals, cited in the bibliography at
the end of this chapter, discusses alternative procedures of this sort for various other
situations.


EXERCISES Section 7.3 (28-41) 


28. Determine the values of the following quantities:

a. $t_{.1,15}$   b. $t_{.05,15}$   c. $t_{.05,25}$   d. $t_{.05,40}$   e. $t_{.005,40}$


29. Determine the t critical value(s) that will capture the
desired t-curve area in each of the following cases:

a. Central area = .95, df = 10
b. Central area = .95, df = 20
c. Central area = .99, df = 20
d. Central area = .99, df = 50
e. Upper-tail area = .01, df = 25
f. Lower-tail area = .025, df = 5


30. Determine the t critical value for a two-sided confidence
interval in each of the following situations:

a. Confidence level = 95%, df = 10
b. Confidence level = 95%, df = 15
c. Confidence level = 99%, df = 15
d. Confidence level = 99%, n = 5
e. Confidence level = 98%, df = 24
f. Confidence level = 99%, n = 38


31. Determine the t critical value for a lower or an upper
confidence bound for each of the situations described in
Exercise 30.


32. According to the article “Fatigue Testing of Condoms”
(Polymer Testing, 2009: 567-571), “tests currently used
for condoms are surrogates for the challenges they face in
use,” including a test for holes, an inflation test, a package
seal test, and tests of dimensions and lubricant quality (all
fertile territory for the use of statistical methodology!). The
investigators developed a new test that adds cyclic strain to
a level well below breakage and determines the number of
cycles to break. A sample of 20 condoms of one particular
type resulted in a sample mean number of 1584 and a
sample standard deviation of 607. Calculate and interpret a
confidence interval at the 99% confidence level for the true
average number of cycles to break. [Note: The article
presented the results of hypothesis tests based on the t
distribution; the validity of these depends on assuming normal
population distributions.]


33. The article “Measuring and Understanding the Aging
of Kraft Insulating Paper in Power Transformers”
(IEEE Electrical Insul. Mag., 1996: 28-34) contained
the following observations on degree of polymerization
for paper specimens for which viscosity times concentration
fell in a certain middle range:

418 421 421 422 425 427 431
434 437 439 446 447 448 453
454 463 465

a. Construct a boxplot of the data and comment on any
interesting features.

b. Is it plausible that the given sample observations
were selected from a normal distribution?

c. Calculate a two-sided 95% confidence interval for true
average degree of polymerization (as did the authors
of the article). Does the interval suggest that 440 is a
plausible value for true average degree of polymerization?
What about 450?


34. A sample of 14 joint specimens of a particular type gave a
sample mean proportional limit stress of 8.48 MPa and a
sample standard deviation of .79 MPa (“Characterization
of Bearing Strength Factors in Pegged Timber
Connections,” J. of Structural Engr., 1997: 326-332).

a. Calculate and interpret a 95% lower confidence bound
for the true average proportional limit stress of all such
joints. What, if any, assumptions did you make about
the distribution of proportional limit stress?


35. 


36. 


37. 


95, 
78 


Variable N Mean 


b. Calculate and interpret a 95% lower prediction 
bound for the proportional limit stress of a single 
joint of this type. 


Silicone implant augmentation rhinoplasty is used to 
correct congenital nose deformities. The success of the 
procedure depends on various biomechanical proper- 
ties of the human nasal periosteum and fascia. The 
article “Biomechanics in Augmentation Rhinoplasty” 

(J. of Med. Engr. and Tech., 2005: 14-17) reported 

that for a sample of 15 (newly deceased) adults, the 

mean failure strain (%) was 25.0, and the standard 

deviation was 3.5. 

a. Assuming a normal distribution for failure strain, 
estimate true average strain in a way that conveys 
information about precision and reliability. 

b. Predict the strain for a single adult in a way that 
conveys information about precision and reliability. 
How does the prediction compare to the estimate 
calculated in part (a)? 


A normal probability plot of the n = 26 observations on 

escape time given in Exercise 36 of Chapter 1 shows a 

substantial linear pattern; the sample mean and sample 

standard deviation are 370.69 and 24.36, respectively. 

a. Calculate an upper confidence bound for popula- 
tion mean escape time using a confidence level of 
95%. 

b. Calculate an upper prediction bound for the escape 
time of a single additional worker using a prediction 
level of 95%. How does this bound compare with the 
confidence bound of part (a)? 

c. Suppose that two additional workers will be chosen 
to participate in the simulated escape exercise. 
Denote their escape times by X,, and X,., and let X,,., 

denote the average of these two values. Modify the 

formula for a PI for a single x value to obtain a PI for 

Xjews and calculate a 95% two-sided interval based 


new? 


on the given escape data. 


A study of the ability of individuals to walk in a straight 
line (“Can We Really Walk Straight?” Amer. J. of 
Physical Anthro., 1992: 19-27) reported the accompa- 
nying data on cadence (strides per second) for a sample 
of n = 20 randomly selected healthy men. 

85 92 95 93 86 1.00 92 85 81 
93 93) «1.05 93 1.06 1.06 96 81 .96 


A normal probability plot gives substantial support to the 
assumption that the population distribution of cadence 
is approximately normal. A descriptive summary of the 
data from Minitab follows: 


Median TrMean StDev SEMean 


cadence 20 0.9255 0.9300 0.9261 0.08090.0181 
Variable Min Max Ql Q3 
cadence 0.7800 1.0600 0.8525 0.9600 


a. Calculate and interpret a 95% confidence interval for 
population mean cadence. 


38. 


39. 


40. 
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b. Calculate and interpret a 95% prediction interval for 
the cadence of a single individual randomly selected 
from this population. 

c. Calculate an interval that includes at least 99% of the 
cadences in the population distribution using a con- 
fidence level of 95%. 


Ultra high performance concrete (UHPC) is a rela- 
tively new construction material that is characterized by 
strong adhesive properties with other materials. The 
article ‘Adhesive Power of Ultra High Performance 
Concrete from a Thermodynamic Point of View” (J. 
of Materials in Civil Engr., 2012: 1050-1058) described 
an investigation of the intermolecular forces for UHPC 
connected to various substrates. The following work of 
adhesion measurements (in mJ/m?) for UHPC specimens 
adhered to steel appeared in the article: 

107.1 109.5 107.4 106.8 108.1 

a. Is it plausible that the given sample observations 
were selected from a normal distribution? 

b. Calculate a two-sided 95% confidence interval for 
the true average work of adhesion for UHPC adhered 
to steel. Does the interval suggest that 107 is a plau- 
sible value for the true average work of adhesion for 
UHPC adhered to steel? What about 110? 

c. Predict the resulting work of adhesion value resulting 
from a single future replication of the experiment by 
calculating a 95% prediction interval, and compare the 
width of this interval to the width of the CI from (b). 

d. Calculate an interval for which you can have a high 
degree of confidence that at least 95% of all UHPC 
specimens adhered to steel will have work of adhe- 
sion values between the limits of the interval. 


Exercise 72 of Chapter | gave the following observations 

on a receptor binding measure (adjusted distribution 

volume) for a sample of 13 healthy individuals: 23, 39, 

40, 41, 43, 47, 51, 58, 63, 66, 67, 69, 72. 

a. Is it plausible that the population distribution from 
which this sample was selected is normal? 

b. Calculate an interval for which you can be 95% con- 
fident that at least 95% of all healthy individuals in 
the population have adjusted distribution volumes 
lying between the limits of the interval. 

c. Predict the adjusted distribution volume of a single 
healthy individual by calculating a 95% prediction 
interval. How does this interval’s width compare to 
the width of the interval calculated in part (b)? 


Exercise 13 of Chapter 1 presented a sample of n = 153 
observations on ultimate tensile strength, and Exercise 17 
of the previous section gave summary quantities and 
requested a large-sample confidence interval. Because 
the sample size is large, no assumptions about the popu- 
lation distribution are required for the validity of the CI. 
a. Is any assumption about the tensile-strength distribu- 

tion required prior to calculating a lower prediction 

bound for the tensile strength of the next specimen 
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selected using the method described in this section? 20 df, the areas to the right of the values .687, .860, and 

Explain. 1.064 are .25, .20, and .15, respectively. What is the 
b. Use a statistical software package to investigate the confidence level for each of the following three confi- 

plausibility of a normal population distribution. dence intervals for the mean pw of a normal population 
c. Calculate a lower prediction bound with a prediction distribution? Which of the three intervals would you 

level of 95% for the ultimate tensile strength of the recommend be used, and why? 

next specimen selected. a. (% — .687s/V21, x + 1.725s/V21) 


41. A more extensive tabulation of ¢ critical values than what b. @ — 860s/V21, +7 1.325s/V21) 
appears in this book shows that for the ¢ distribution with c. (% — 1.064s/V21, x + 1.064s/V21) 


7.4 Confidence Intervals for the Variance and Standard Deviation of a Normal Population


Although inferences concerning a population variance $\sigma^2$ or standard deviation $\sigma$ are
usually of less interest than those about a mean or proportion, there are occasions
when such procedures are needed. In the case of a normal population distribution,
inferences are based on the following result concerning the sample variance $S^2$.


THEOREM Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with
parameters $\mu$ and $\sigma^2$. Then the rv

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared ($\chi^2$) probability distribution with $n - 1$ df.


As discussed in Sections 4.4 and 7.1, the chi-squared distribution is a continuous
probability distribution with a single parameter $\nu$, called the number of degrees
of freedom, with possible values 1, 2, 3, .... The graphs of several $\chi^2$ probability
density functions (pdf’s) are illustrated in Figure 7.10. Each pdf $f(x; \nu)$ is positive
only for $x > 0$, and each has a positive skew (stretched out upper tail), though
the distribution moves rightward and becomes more symmetric as $\nu$ increases. To
specify inferential procedures that use the chi-squared distribution, we need notation
analogous to that for a t critical value $t_{\alpha,\nu}$.


Figure 7.10 Graphs of chi-squared density functions


NOTATION Let $\chi^2_{\alpha,\nu}$, called a chi-squared critical value, denote the number on the
horizontal axis such that $\alpha$ of the area under the chi-squared curve with $\nu$ df lies
to the right of $\chi^2_{\alpha,\nu}$.


Symmetry of t distributions made it necessary to tabulate only upper-tailed
t critical values ($t_{\alpha,\nu}$ for small values of $\alpha$). The chi-squared distribution is not
symmetric, so Appendix Table A.7 contains values of $\chi^2_{\alpha,\nu}$ both for $\alpha$ near 0 and near 1,
as illustrated in Figure 7.11(b). For example, $\chi^2_{.025,14} = 26.119$, and $\chi^2_{.95,20}$ (the 5th
percentile) $= 10.851$.
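
When Appendix Table A.7 is not at hand, critical values of this kind can be approximated numerically. The following is a minimal sketch, not a method from the text: the function names `chi2_cdf` and `chi2_critical` are illustrative, the cdf is evaluated with the standard power series for the regularized incomplete gamma function, and the critical value is found by bisection.

```python
import math


def chi2_cdf(x, df):
    """Area to the LEFT of x under the chi-squared curve with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma
    function P(a, z) with a = df/2 and z = x/2.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0
    k = 1
    # Sum z^k / ((a+1)(a+2)...(a+k)) until the terms are negligible.
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))


def chi2_critical(alpha, df):
    """Value with area alpha to its RIGHT (the text's chi-squared critical value)."""
    target = 1.0 - alpha          # area to the left of the critical value
    lo, hi = 0.0, max(1.0, float(df))
    while chi2_cdf(hi, df) < target:
        hi *= 2.0                 # widen the bracket until it contains the answer
    for _ in range(200):          # bisection on the increasing cdf
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# The two values quoted in the text are 26.119 and 10.851.
print(chi2_critical(0.025, 14))
print(chi2_critical(0.95, 20))
```

Because the subscript $\alpha$ is an upper-tail area, the 5th percentile corresponds to $\alpha = .95$, which is why the sketch converts $\alpha$ to a left-tail target before searching.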


Figure 7.11 $\chi^2_{\alpha,\nu}$ notation illustrated: (a) shaded upper-tail area = $\alpha$; (b) each shaded tail area = .01


The rv $(n - 1)S^2/\sigma^2$ satisfies the two properties on which the general method
for obtaining a CI is based: It is a function of the parameter of interest $\sigma^2$, yet its
probability distribution (chi-squared) does not depend on this parameter. The area
under a chi-squared curve with $\nu$ df to the right of $\chi^2_{\alpha/2,\nu}$ is $\alpha/2$, as is the area to the
left of $\chi^2_{1-\alpha/2,\nu}$. Thus the area captured between these two critical values is $1 - \alpha$.
As a consequence of this and the theorem just stated,

$$P\left(\chi^2_{1-\alpha/2,\,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,\,n-1}\right) = 1 - \alpha \qquad (7.17)$$

The inequalities in (7.17) are equivalent to

$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}$$

Substituting the computed value $s^2$ into the limits gives a CI for $\sigma^2$, and taking square
roots gives an interval for $\sigma$.


A $100(1 - \alpha)\%$ confidence interval for the variance $\sigma^2$ of a normal population
has lower limit
$$(n - 1)s^2/\chi^2_{\alpha/2,\,n-1}$$
and upper limit
$$(n - 1)s^2/\chi^2_{1-\alpha/2,\,n-1}$$

A confidence interval for $\sigma$ has lower and upper limits that are the square
roots of the corresponding limits in the interval for $\sigma^2$. An upper or a lower
confidence bound results from replacing $\alpha/2$ with $\alpha$ in the corresponding limit
of the CI.
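
The meaning of the confidence level can be checked empirically. The sketch below is an illustration under stated assumptions rather than part of the text: it repeatedly draws normal samples of size 17 with an arbitrarily chosen mean and standard deviation, builds the 95% interval for $\sigma^2$ using the tabulated critical values for 16 df (6.908 and 28.845), and reports the fraction of intervals that capture the true variance. The function names, the seed, and the simulation settings are all hypothetical choices.

```python
import random

# Tabulated chi-squared critical values for 16 df (95% two-sided interval).
CHI2_975_16 = 6.908    # area .975 to the right
CHI2_025_16 = 28.845   # area .025 to the right


def variance_ci_16df(sample):
    """95% CI for sigma^2 from a sample of size 17 (16 df)."""
    n = len(sample)
    if n != 17:
        raise ValueError("the hard-coded critical values require n = 17")
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (n - 1) * s2 / CHI2_025_16, (n - 1) * s2 / CHI2_975_16


def coverage(true_sigma=3.0, reps=4000, seed=1):
    """Fraction of simulated intervals that contain the true variance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(10.0, true_sigma) for _ in range(17)]
        lower, upper = variance_ci_16df(sample)
        if lower < true_sigma ** 2 < upper:
            hits += 1
    return hits / reps


# Expected to be close to .95 when the population really is normal.
print(coverage())
```

Replacing `rng.gauss` with draws from a heavy-tailed or skewed distribution is a quick way to see how strongly this interval depends on the normality assumption.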


EXAMPLE 7.15 The accompanying data on breakdown voltage of electrically stressed circuits was 
read from a normal probability plot that appeared in the article “Damage of Flexible 
Printed Wiring Boards Associated with Lightning-Induced Voltage Surges” 




(IEEE Transactions on Components, Hybrids, and Manuf. Tech., 1985: 214-220).
The straightness of the plot gave strong support to the assumption that breakdown 
voltage is approximately normally distributed. 


1470 1510 1690 1740 1900 2000 2030 2100 2190
2200 2290 2380 2390 2480 2500 2580 2700


Let $\sigma^2$ denote the variance of the breakdown voltage distribution. The computed
value of the sample variance is $s^2 = 137{,}324.3$, the point estimate of $\sigma^2$. With
df $= n - 1 = 16$, a 95% CI requires $\chi^2_{.975,16} = 6.908$ and $\chi^2_{.025,16} = 28.845$. The
interval is

$$\left(\frac{16(137{,}324.3)}{28.845},\ \frac{16(137{,}324.3)}{6.908}\right) = (76{,}172.3,\ 318{,}064.4)$$

Taking the square root of each endpoint yields (276.0, 564.0) as the 95% CI for $\sigma$.
These intervals are quite wide, reflecting substantial variability in breakdown voltage
in combination with a small sample size.
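
The arithmetic of Example 7.15 can be reproduced in a few lines. This is a minimal sketch: the variable names are illustrative, and the two critical values are simply the table values quoted in the example rather than being computed.

```python
import math

# Breakdown voltages from Example 7.15 (n = 17).
voltages = [1470, 1510, 1690, 1740, 1900, 2000, 2030, 2100, 2190,
            2200, 2290, 2380, 2390, 2480, 2500, 2580, 2700]

n = len(voltages)
mean = sum(voltages) / n
s2 = sum((x - mean) ** 2 for x in voltages) / (n - 1)   # sample variance

# Table values for 16 df, as quoted in the example.
chi2_975 = 6.908     # area .975 to the right (lower critical value)
chi2_025 = 28.845    # area .025 to the right (upper critical value)

# Dividing by the LARGER critical value gives the LOWER limit, and vice versa.
var_ci = ((n - 1) * s2 / chi2_025, (n - 1) * s2 / chi2_975)
sd_ci = (math.sqrt(var_ci[0]), math.sqrt(var_ci[1]))

print(f"s^2 = {s2:.1f}")
print(f"95% CI for sigma^2: ({var_ci[0]:.1f}, {var_ci[1]:.1f})")
print(f"95% CI for sigma:   ({sd_ci[0]:.1f}, {sd_ci[1]:.1f})")
```

Small differences in the last digit relative to the example can arise because the example rounds $s^2$ to 137,324.3 before dividing.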


CIs for $\sigma^2$ and $\sigma$ when the population distribution is not normal can be difficult
to obtain. For such cases, consult a knowledgeable statistician.


EXERCISES Section 7.4 (42-46)


42. Determine the values of the following quantities:
[parts a-f: chi-squared critical values $\chi^2_{\alpha,\nu}$; subscripts illegible in the source]

43. Determine the following:
a. The 95th percentile of the chi-squared distribution
with $\nu = 10$
b. The 5th percentile of the chi-squared distribution
with $\nu = 10$
c. $P(10.98 \le \chi^2 \le 36.78)$, where $\chi^2$ is a chi-squared rv
with $\nu = 22$
d. $P(\chi^2 < 14.611 \text{ or } \chi^2 > 37.652)$, where $\chi^2$ is a
chi-squared rv with $\nu = 25$

44. The amount of lateral expansion (mils) was determined
for a sample of n = 9 pulsed-power gas metal arc welds
used in LNG ship containment tanks. The resulting
sample standard deviation was s = 2.81 mils. Assuming
normality, derive a 95% CI for $\sigma^2$ and for $\sigma$.

45. Wire electrical-discharge machining (WEDM) is a process
used to manufacture conductive hard metal components.
It uses a continuously moving wire that serves as
an electrode. Coating on the wire electrode allows for
cooling of the wire electrode core and provides an
improved cutting performance. The article “High-
Performance Wire Electrodes for Wire Electrical-
Discharge Machining—A Review” (J. of Engr. Manuf.,
2012: 1757-1773) gave the following sample observations
on total coating layer thickness (in μm) of eight
wire electrodes used for WEDM:

21 16 29 35 42 24 24 25

Calculate a 99% CI for the standard deviation of the
coating layer thickness distribution. Is this interval valid
whatever the nature of the distribution? Explain.

46. The article “Concrete Pressure on Formwork” (Mag.
of Concrete Res., 2009: 407-417) gave the following
observations on maximum concrete pressure (kN/m²):

33.2 41.8 37.3 40.2 36.7 39.1 36.2 41.8
36.0 35.2 36.7 38.9 35.8 35.2 40.1

a. Is it plausible that this sample was selected from a
normal population distribution?
b. Calculate an upper confidence bound with confidence
level 95% for the population standard deviation
of maximum pressure.


SUPPLEMENTARY EXERCISES (47-62)


47. Example 1.11 introduced the accompanying observations
on bond strength.

11.5 12.1  9.9  9.3  7.8  6.2  6.6  7.0
13.4 17.1  9.3  5.6  5.7  5.4  5.2  5.1
 4.9 10.7 15.2  8.5  4.2  4.0  3.9  3.8
 3.6  3.4 20.6 25.5 13.8 12.6 13.1  8.9
 8.2 10.7 14.2  7.6  5.2  5.5  5.1  5.0
 5.2  4.8  4.1  3.8  3.7  3.6  3.6  3.6

a. Estimate true average bond strength in a way that
conveys information about precision and reliability.
[Hint: $\sum x_i = 387.8$ and $\sum x_i^2 = 4247.08$.]
b. Calculate a 95% CI for the proportion of all such
bonds whose strength values would exceed 10.

48. The article “Distributions of Compressive Strength
Obtained from Various Diameter Cores” (ACI
Materials J., 2012: 597-606) described a study in which
compressive strengths were determined for concrete
specimens of various types, core diameters, and length-
to-diameter ratios. For one particular type, diameter, and
l/d ratio, the 18 tested specimens resulted in a sample
mean compressive strength of 64.41 MPa and a sample
standard deviation of 10.32 MPa. Normality of the compressive
strength distribution was judged to be quite
plausible.
a. Calculate a confidence interval with confidence level
98% for the true average compressive strength under
these circumstances.
b. Calculate a 98% lower prediction bound for the
compressive strength of a single future specimen
tested under the given circumstances. [Hint: $t_{.02,17} = 2.224$.]

49. For those of you who don’t already know, dragon boat
racing is a competitive water sport that involves 20 paddlers
propelling a boat across various race distances. It
has become increasingly popular over the last few
years. The article “Physiological and Physical
Characteristics of Elite Dragon Boat Paddlers” (J.
of Strength and Conditioning, 2013: 137-145) summarized
an extensive statistical analysis of data
obtained from a sample of 11 paddlers. It reported that
a 95% confidence interval for true average force (N)
during a simulated 200-m race was (60.2, 70.6). Obtain
a 95% prediction interval for the force of a single randomly
selected dragon boat paddler undergoing the
simulated race.

50. A journal article reports that a sample of size 5 was
used as a basis for calculating a 95% CI for the true
average natural frequency (Hz) of delaminated beams
of a certain type. The resulting interval was (229.764,
233.504). You decide that a confidence level of 99% is
more appropriate than the 95% level used. What are the
limits of the 99% interval? [Hint: Use the center of the
interval and its width to determine $\bar{x}$ and $s$.]

51. Unexplained respiratory symptoms reported by athletes
are often incorrectly considered secondary to
exercise-induced asthma. The article “High Prevalence
of Exercise-Induced Laryngeal Obstruction in
Athletes” (Medicine and Science in Sports and
Exercise, 2013: 2030-2035) suggested that many such
cases could instead be explained by obstruction of the
larynx. In a sample of 88 athletes referred for an
asthma workup, 31 were found to have the EILO
condition.
a. Calculate and interpret a confidence interval using a
95% confidence level for the true proportion of all
athletes found to have the EILO condition under
these circumstances.
b. What sample size is required if the desired width of
the 95% CI is to be at most .04, irrespective of the
sample results?
c. Does the upper limit of the interval in (a) specify a
95% upper confidence bound for the proportion
being estimated? Explain.

52. High concentration of the toxic element arsenic is all too
common in groundwater. The article “Evaluation of
Treatment Systems for the Removal of Arsenic from
Groundwater” (Practice Periodical of Hazardous,
Toxic, and Radioactive Waste Mgmt., 2005: 152-157)
reported that for a sample of n = 5 water specimens
selected for treatment by coagulation, the sample mean
arsenic concentration was 24.3 μg/L, and the sample
standard deviation was 4.1. The authors of the cited
article used t-based methods to analyze their data, so
hopefully had reason to believe that the distribution of
arsenic concentration was normal.
a. Calculate and interpret a 95% CI for true average
arsenic concentration in all such water specimens.
b. Calculate a 90% upper confidence bound for the
standard deviation of the arsenic concentration
distribution.
c. Predict the arsenic concentration for a single water
specimen in a way that conveys information about
precision and reliability.

53. Aphid infestation of fruit trees can be controlled either by
spraying with pesticide or by inundation with ladybugs.
In a particular area, four different groves of fruit trees are
selected for experimentation. The first three groves are
sprayed with pesticides 1, 2, and 3, respectively, and the
fourth is treated with ladybugs, with the following results
on yield:

Treatment   $n_i$ = Number of Trees   $\bar{x}_i$ (Bushels/Tree)   $s_i$
1           100                       10.5                         1.5
2            90                       10.0                         1.3
3           100                       10.1                         1.8
4           120                       10.7                         1.6

Let $\mu_i$ = the true average yield (bushels/tree) after
receiving the ith treatment. Then

$$\theta = \tfrac{1}{3}(\mu_1 + \mu_2 + \mu_3) - \mu_4$$

measures the difference in true average yields between
treatment with pesticides and treatment with ladybugs.
When $n_1$, $n_2$, $n_3$, and $n_4$ are all large, the estimator $\hat{\theta}$
obtained by replacing each $\mu_i$ by $\bar{X}_i$ is approximately
normal. Use this to derive a large-sample $100(1 - \alpha)\%$
CI for $\theta$, and compute the 95% interval for the given
data.

54. It is important that face masks used by firefighters be
able to withstand high temperatures because firefighters
commonly work in temperatures of 200-500°F. In a test
of one type of mask, 11 of 55 masks had lenses pop out
at 250°. Construct a 90% upper confidence bound for the
true proportion of masks of this type whose lenses would
pop out at 250°.

55. A manufacturer of college textbooks is interested in estimating
the strength of the bindings produced by a particular
binding machine. Strength can be measured by
recording the force required to pull the pages from the
binding. If this force is measured in pounds, how many
books should be tested to estimate the average force
required to break the binding to within .1 lb with 95%
confidence? Assume that $\sigma$ is known to be .8.

56. The accompanying data on crack initiation depth (μm)
was read from a lognormal probability plot that appeared
in the article “Incorporating Small Fatigue Crack
Growth in Probabilistic Life Prediction: Effect of
Stress Ratio in Ti-6Al-2Sn-6Mo” (Intl. J. of Fatigue,
2013: 83-95). Although the pattern in the plot was quite
straight, a normal probability plot of the data also shows
a reasonably linear pattern. And a boxplot indicates that
the distribution is quite symmetric in the middle 50% of
the data and only mildly skewed overall. It is therefore
reasonable to estimate and predict using t intervals.

4.7 5.1 5.2 5.3 5.6 5.8 6.3 6.7
7.2 7.4 7.7 8.5 8.9 9.3 10.1 11.2

a. Estimate the true average crack initiation depth with
a 99% CI and interpret the resulting interval.
b. Predict the value of a single crack initiation depth by
constructing a 99% PI.
c. Interpret in context the meaning of 99% in (b).

57. In Example 6.8, we introduced the concept of a censored
experiment in which n components are put on test and
the experiment terminates as soon as r of the components
have failed. Suppose component lifetimes are
independent, each having an exponential distribution
with parameter $\lambda$. Let $Y_1$ denote the time at which the
first failure occurs, $Y_2$ the time at which the second failure
occurs, and so on, so that $T_r = Y_1 + \cdots + Y_r + (n - r)Y_r$
is the total accumulated lifetime at termination.
Then it can be shown that $2\lambda T_r$ has a chi-squared
distribution with $2r$ df. Use this fact to develop a
$100(1 - \alpha)\%$ CI formula for true average lifetime $1/\lambda$.
Compute a 95% CI from the data in Example 6.8.

58. Let $X_1, X_2, \ldots, X_n$ be a random sample from a continuous
probability distribution having median $\tilde{\mu}$ (so that
$P(X_i \le \tilde{\mu}) = P(X_i \ge \tilde{\mu}) = .5$).
a. Show that

$$P(\min(X_i) < \tilde{\mu} < \max(X_i)) = 1 - \left(\tfrac{1}{2}\right)^{n-1}$$

so that $(\min(x_i), \max(x_i))$ is a $100(1 - \alpha)\%$ confidence
interval for $\tilde{\mu}$ with $\alpha = \left(\tfrac{1}{2}\right)^{n-1}$. [Hint: The complement
of the event $\{\min(X_i) < \tilde{\mu} < \max(X_i)\}$ is $\{\max(X_i) \le \tilde{\mu}\} \cup \{\min(X_i) \ge \tilde{\mu}\}$.
But $\max(X_i) \le \tilde{\mu}$ iff $X_i \le \tilde{\mu}$ for all $i$.]
b. For each of six normal male infants, the amount of
the amino acid alanine (mg/100 mL) was determined
while the infants were on an isoleucine-free diet,
resulting in the following data:

2.84 3.54 2.80 1.44 2.94 2.70

Compute a 97% CI for the true median amount of
alanine for infants on such a diet (“The Essential
Amino Acid Requirements of Infants,” Amer. J. of
Nutrition, 1964: 322-330).
c. Let $x_{(2)}$ denote the second smallest of the $x_i$’s and
$x_{(n-1)}$ denote the second largest of the $x_i$’s. What is
the confidence level of the interval $(x_{(2)}, x_{(n-1)})$ for $\tilde{\mu}$?

59. Let $X_1, X_2, \ldots, X_n$ be a random sample from a uniform
distribution on the interval $[0, \theta]$, so that

$$f(x) = \begin{cases} \dfrac{1}{\theta} & 0 \le x \le \theta \\ 0 & \text{otherwise} \end{cases}$$

Then if $Y = \max(X_i)$, it can be shown that the rv
$U = Y/\theta$ has density function

$$f_U(u) = \begin{cases} nu^{n-1} & 0 \le u \le 1 \\ 0 & \text{otherwise} \end{cases}$$

a. Use $f_U(u)$ to verify that

$$P\left((\alpha/2)^{1/n} < \frac{Y}{\theta} < (1 - \alpha/2)^{1/n}\right) = 1 - \alpha$$

and use this to derive a $100(1 - \alpha)\%$ CI for $\theta$.
b. Verify that $P(\alpha^{1/n} \le Y/\theta \le 1) = 1 - \alpha$, and derive
a $100(1 - \alpha)\%$ CI for $\theta$ based on this probability
statement.
c. Which of the two intervals derived previously is
shorter? If my waiting time for a morning bus is
uniformly distributed and observed waiting times are
$x_1 = 4.2$, $x_2 = 3.5$, $x_3 = 1.7$, $x_4 = 1.2$, and $x_5 = 2.4$,
derive a 95% CI for $\theta$ by using the shorter of the two
intervals.

60. Let $0 \le \gamma \le \alpha$. Then a $100(1 - \alpha)\%$ CI for $\mu$ when n is
large is

$$\left(\bar{x} - z_\gamma \cdot \frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha-\gamma} \cdot \frac{s}{\sqrt{n}}\right)$$

The choice $\gamma = \alpha/2$ yields the usual interval derived in
Section 7.2; if $\gamma \ne \alpha/2$, this interval is not symmetric about
$\bar{x}$. The width of this interval is $w = s(z_\gamma + z_{\alpha-\gamma})/\sqrt{n}$.
Show that $w$ is minimized for the choice $\gamma = \alpha/2$, so
that the symmetric interval is the shortest. [Hints: (a) By
definition of $z_\alpha$, $\Phi(z_\alpha) = 1 - \alpha$, so that $z_\alpha = \Phi^{-1}(1 - \alpha)$;
(b) the relationship between the derivative of a function
$y = f(x)$ and the inverse function $x = f^{-1}(y)$ is
$(d/dy) f^{-1}(y) = 1/f'(x)$.]

61. Suppose $x_1, x_2, \ldots, x_n$ are observed values resulting
from a random sample from a symmetric but possibly
heavy-tailed distribution. Let $\tilde{x}$ and $f_s$ denote the
sample median and fourth spread, respectively.
Chapter 11 of Understanding Robust and Exploratory
Data Analysis (see the bibliography in Chapter 6) suggests
the following robust 95% CI for the population
mean (point of symmetry):

$$\tilde{x} \pm \left(\frac{\text{conservative } t \text{ critical value}}{1.075}\right) \cdot \frac{f_s}{\sqrt{n}}$$

The value of the quantity in parentheses is 2.10 for
n = 10, 1.94 for n = 20, and 1.91 for n = 30. Compute
this CI for the data of Exercise 45, and compare to the t CI
appropriate for a normal population distribution.

62. a. Use the results of Example 7.5 to obtain a 95% lower
confidence bound for the parameter $\lambda$ of an exponential
distribution, and calculate the bound based on
the data given in the example.
b. If lifetime X has an exponential distribution, the
probability that lifetime exceeds t is $P(X > t) = e^{-\lambda t}$.
Use the result of part (a) to obtain a 95% lower confidence
bound for the probability that breakdown
time exceeds 100 min.


Tests of Hypotheses Based on a Single Sample

INTRODUCTION 


A parameter can be estimated from sample data either by a single number 
(a point estimate) or an entire interval of plausible values (a confidence inter- 
val). Frequently, however, the objective of an investigation is not to estimate a 
parameter but to decide which of two contradictory claims about the param- 
eter is correct. Methods for accomplishing this comprise the part of statistical 
inference called hypothesis testing. In this chapter, we first discuss some of the 
basic concepts and terminology in hypothesis testing and then develop decision 
procedures for the most frequently encountered testing problems based on a 
sample from a single population.


8.1 Hypotheses and Test Procedures


A statistical hypothesis, or just hypothesis, is a claim or assertion either about the
value of a single parameter (population characteristic or characteristic of a probability
distribution), about the values of several parameters, or about the form of an entire
probability distribution. One example of a hypothesis is the claim $\mu = .75$, where
$\mu$ is the true average inside diameter of a certain type of PVC pipe. Another example
is the statement $p < .10$, where $p$ is the proportion of defective circuit boards
among all circuit boards produced by a certain manufacturer. If $\mu_1$ and $\mu_2$ denote
the true average breaking strengths of two different types of twine, one hypothesis
is the assertion that $\mu_1 - \mu_2 = 0$, and another is the statement $\mu_1 - \mu_2 > 5$. Yet
another example of a hypothesis is the assertion that vehicle braking distance under
particular conditions has a normal distribution. Hypotheses of this latter sort will be
considered in Chapter 14. In this and the next several chapters, we concentrate on
hypotheses about parameters.

In any hypothesis-testing problem, there are two contradictory hypotheses under 
consideration. One hypothesis might be the claim $\mu = .75$ and the other $\mu \ne .75$, or
the two contradictory statements might be $p \ge .10$ and $p < .10$. The objective is to
decide, based on sample information, which of the two hypotheses is correct. There is 
a familiar analogy to this in a criminal trial. One claim is the assertion that the accused 
individual is innocent. In the U.S. judicial system, this is the claim that is initially 
believed to be true. Only in the face of strong evidence to the contrary should the jury 
reject this claim in favor of the alternative assertion that the accused is guilty. In this 
sense, the claim of innocence is the favored or protected hypothesis, and the burden of 
proof is placed on those who believe in the alternative claim. 

Similarly, in testing statistical hypotheses, the problem will be formulated so 
that one of the claims is initially favored. This initially favored claim will not be 
rejected in favor of the alternative claim unless sample evidence contradicts it and 
provides strong support for the alternative assertion. 


DEFINITION The null hypothesis, denoted by $H_0$, is the claim that is initially assumed to
be true (the “prior belief” claim). The alternative hypothesis, denoted by $H_a$,
is the assertion that is contradictory to $H_0$.

The null hypothesis will be rejected in favor of the alternative hypothesis
only if sample evidence suggests that $H_0$ is false. If the sample does not
strongly contradict $H_0$, we will continue to believe in the plausibility of the
null hypothesis. The two possible conclusions from a hypothesis-testing analysis
are then reject $H_0$ or fail to reject $H_0$.


A test of hypotheses is a method for using sample data to decide whether the null
hypothesis should be rejected. Thus we might test $H_0$: $\mu = .75$ against the alternative
$H_a$: $\mu \ne .75$. Only if sample data strongly suggests that $\mu$ is something other
than .75 should the null hypothesis be rejected. In the absence of such evidence, $H_0$
should not be rejected, since it is still quite plausible.

Sometimes an investigator does not want to accept a particular assertion unless 
and until data can provide strong support for the assertion. As an example, suppose 
a company is considering putting a new type of coating on bearings that it produces.
The true average wear life with the current coating is known to be 1000 hours.
With $\mu$ denoting the true average life for the new coating, the company would not
want to make a change unless evidence strongly suggested that $\mu$ exceeds 1000.
An appropriate problem formulation would involve testing $H_0$: $\mu = 1000$ against
$H_a$: $\mu > 1000$. The conclusion that a change is justified is identified with $H_a$, and
it would take conclusive evidence to justify rejecting $H_0$ and switching to the new
coating.

Scientific research often involves trying to decide whether a current theory
should be replaced by a more plausible and satisfactory explanation of the
phenomenon under investigation. A conservative approach is to identify the current theory
with H₀ and the researcher’s alternative explanation with Hₐ. Rejection of the current
theory will then occur only when evidence is much more consistent with the new
theory. In many situations, Hₐ is referred to as the “researcher’s hypothesis,” since it
is the claim that the researcher would really like to validate. The word null means “of
no value, effect, or consequence,” which suggests that H₀ should be identified with
the hypothesis of no change (from current opinion), no difference, no improvement,
and so on. Suppose, for example, that 10% of all circuit boards produced by a certain
manufacturer during a recent period were defective. An engineer has suggested a
change in the production process in the belief that it will result in a reduced defective
rate. Let p denote the true proportion of defective boards resulting from the changed
process. Then the research hypothesis, on which the burden of proof is placed, is the
assertion that p < .10. Thus the alternative hypothesis is Hₐ: p < .10.

In our treatment of hypothesis testing, H₀ will generally be stated as an
equality claim. If θ denotes the parameter of interest, the null hypothesis will have
the form H₀: θ = θ₀, where θ₀ is a specified number called the null value of the
parameter (value claimed for θ by the null hypothesis). As an example, consider
the circuit board situation just discussed. The suggested alternative hypothesis was
Hₐ: p < .10, the claim that the defective rate is reduced by the process modification.
A natural choice of H₀ in this situation is the claim that p ≥ .10, according to which
the new process is either no better or worse than the one currently used. We will
instead consider H₀: p = .10 versus Hₐ: p < .10. The rationale for using this
simplified null hypothesis is that any reasonable decision procedure for deciding between
H₀: p = .10 and Hₐ: p < .10 will also be reasonable for deciding between the claim
that p ≥ .10 and Hₐ. The use of a simplified H₀ is preferred because it has certain
technical benefits, which will be apparent shortly.

The alternative to the null hypothesis H₀: θ = θ₀ will look like one of the
following three assertions:

1. Hₐ: θ > θ₀ (in which case the implicit null hypothesis is θ ≤ θ₀),
2. Hₐ: θ < θ₀ (in which case the implicit null hypothesis is θ ≥ θ₀), or
3. Hₐ: θ ≠ θ₀


For example, let σ denote the standard deviation of the distribution of inside
diameters (inches) for a certain type of metal sleeve. If the decision was made to use
the sleeve unless sample evidence conclusively demonstrated that σ > .001, the
appropriate hypotheses would be H₀: σ = .001 versus Hₐ: σ > .001. The number
θ₀ that appears in both H₀ and Hₐ (separates the alternative from the null) is the
null value.


Test Procedures and P-Values 


A test procedure is a rule, based on sample data, for deciding whether H₀ should be
rejected. The key issue will be the following: Suppose that H₀ is in fact true. Then
how likely is it that a (random) sample at least as contradictory to this hypothesis as
our sample would result? Consider the following two scenarios:


1. There is only a .1% chance (a probability of .001) of getting a sample at least
as contradictory to H₀ as what we obtained, assuming that H₀ is true.

2. There is a 25% chance (a probability of .25) of getting a sample at least as
contradictory to H₀ as what we obtained when H₀ is true.


In the first scenario, something as extreme as our sample is very unlikely to have
occurred when H₀ is true: in the long run only 1 in 1000 samples would be at least
as contradictory to the null hypothesis as the one we ended up selecting. In contrast,
for the second scenario, in the long run 25 out of every 100 samples would be at
least as contradictory to H₀ as what we obtained assuming that the null hypothesis
is true. So our sample is quite consistent with H₀, and there is no reason to reject it.

We must now flesh out this reasoning by being more specific as to what is
meant by “at least as contradictory to H₀ as the sample we obtained when H₀ is true.”
Before doing so in a general way, let’s consider several examples.


EXAMPLE 8.1 The company that manufactures brand D Greek-style yogurt is anxious to increase 
its market share, and in particular persuade those who currently prefer brand C to 
switch brands. So the marketing department has devised the following blind taste 
experiment. Each of 100 brand C consumers will be asked to taste yogurt from two 
bowls, one containing brand C and the other brand D, and then say which one he 
or she prefers. The bowls are marked with a code so that the experimenters know 
which bowl contains which yogurt, but the experimental subjects do not have this 
information (Note: Such an experiment involving beers was actually carried out 
several decades ago, with the now defunct Schlitz beer playing the role of brand D 
and Michelob being the target beer). 

Let p denote the proportion of all brand C consumers who would prefer C to
D in such circumstances. Let’s consider testing the hypotheses H₀: p = .5 versus
Hₐ: p < .5. The alternative hypothesis says that a majority of brand C consumers
actually prefer brand D. Of course the brand D company would like to have H₀
rejected so that Hₐ is judged the more plausible hypothesis. If the null hypothesis is
true, then whether a single randomly selected brand C consumer prefers C or D is
analogous to the result of flipping a fair coin.

The sample data will consist of a sequence of 100 preferences, each one a C or
a D. Let X = the number among the 100 selected individuals who prefer C to D. This
random variable will serve as our test statistic, the function of sample data on which
we’ll base our conclusion. Now X is a binomial random variable (the number of
successes in an experiment with a fixed number of independent trials having constant
success probability p). When H₀ is true, this test statistic has a binomial distribution
with p = .5, in which case E(X) = np = 100(.5) = 50.

Intuitively, a value of X “considerably” smaller than 50 argues for rejection
of H₀ in favor of Hₐ. Suppose the observed value of X is x = 37. How contradictory
is this value to the null hypothesis? To answer this question, let’s first identify
values of X that are even more contradictory to H₀ than is 37 itself. Clearly 35 is
one such value, and 30 is another; in fact, any number smaller than 37 is a value of
X more contradictory to the null hypothesis than is the value we actually observed.
Now consider the probability, computed assuming that the null hypothesis is true, of
obtaining a value of X at least as contradictory to H₀ as is our observed value:

    P(X ≤ 37 when H₀ is true) = P(X ≤ 37 when X ~ Bin(100, .5))
                              = B(37; 100, .5) = .006

(from software). Thus if the null hypothesis is true, there is less than a 1% chance
of seeing 37 or fewer successes amongst the 100 trials. This suggests that x = 37
is much more consistent with the alternative hypothesis than with the null, and
that rejection of H₀ in favor of Hₐ is a sensible conclusion. In addition, note that
σ_X = √(npq) = √(100(.5)(.5)) = 5 when H₀ is true. It follows that 37 is more than 2.5
standard deviations smaller than what we’d expect to see were H₀ true.

Now suppose that 45 of the 100 individuals in the experiment prefer C (45
successes). Let’s again calculate the probability, assuming H₀ true, of getting a test
statistic value at least as contradictory to H₀ as this:

    P(X ≤ 45 when H₀ is true) = P(X ≤ 45 when X ~ Bin(100, .5))
                              = B(45; 100, .5) = .184

So if in fact p = .5, it would not be surprising to see 45 or fewer successes. For this
reason, the value 45 does not seem very contradictory to H₀ (it is only one standard
deviation smaller than what we’d expect were H₀ true). Rejection of H₀ in this case
does not seem sensible. ■
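
The two lower-tail probabilities in Example 8.1 can be checked numerically. The following is a minimal sketch, not part of the original example; the helper names binom_cdf and lower_tail_p_value are our own, and the binomial cdf B(x; n, p) is summed directly from its definition.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """Cumulative binomial probability B(x; n, p) = P(X <= x)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def lower_tail_p_value(x_obs: int, n: int, p0: float) -> float:
    """P-value for Ha: p < p0, i.e. P(X <= x_obs) when X ~ Bin(n, p0)."""
    return binom_cdf(x_obs, n, p0)


# Taste experiment: n = 100 consumers, null value p0 = .5
p_37 = lower_tail_p_value(37, 100, 0.5)
p_45 = lower_tail_p_value(45, 100, 0.5)

print(f"x = 37: P-value = {p_37:.3f}")
print(f"x = 45: P-value = {p_45:.3f}")
```

The first probability is well below 1%, while the second is large enough that 45 successes would not be surprising under the null hypothesis, in line with the conclusions of the example.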


EXAMPLE 8.2 According to the article “Freshman 15: Fact or Fiction” (Obesity, 2006: 1438-1443),
“A common belief among the lay public is that body weight increases after
entry into college, and the phrase ‘freshman 15’ has been coined to describe the 15
pounds that students presumably gain over their freshman year.” Let μ denote the
true average weight gain of women over the course of their first year in college.
The foregoing quote suggests that we should test the hypotheses H₀: μ = 15 versus
Hₐ: μ ≠ 15. For this purpose, suppose that a random sample of n such individuals is
selected and the weight gain of each one is determined, resulting in a sample mean
weight gain x̄ and a sample standard deviation s (Note: The data here is actually
paired, with each weight gain resulting from obtaining a (beginning, ending) weight
pair and then subtracting to determine the difference; more will be said about such
data in Section 9.3). Before data is obtained, the sample mean weight gain is a
random variable X̄ and the sample standard deviation is also a random variable S.

A natural test statistic (function of the data on which the decision will be
based) is the sample mean X̄ itself; if H₀ is true, then E(X̄) = μ = 15, whereas if μ
differs considerably from 15, then the sample mean weight gain should do the same.
But there is a more convenient test statistic that has appealing intuitive and technical
properties: the sample mean standardized assuming that H₀ is true. Recall that the
standard deviation (standard error) of X̄ is σ_X̄ = σ/√n. Supposing that the
population distribution of weight gains is normal, it follows that the sampling distribution
of X̄ itself is normal. Now standardizing a normally distributed variable gives a
variable having a standard normal distribution (the z curve):

    Z = (X̄ − μ) / (σ/√n)

If the value of σ were known, we could obtain a test statistic simply by replacing μ
by the null value μ₀ = 15:

    Z = (X̄ − 15) / (σ/√n)


If substitution of x̄, σ, and n results in z = 3, the interpretation is that the observed
value of the sample mean is three standard deviations larger than what we would
have expected it to be were the null hypothesis true. Of course in “normal land” such
an occurrence is exceedingly rare. Alternatively, if z = −1, then the sample mean is
only one standard deviation less than what would be expected under H₀, a result not
surprising enough to cast substantial doubt on H₀.

A practical glitch in the foregoing development is that the value of σ is
virtually never available to an investigator. However, as discussed in the previous chapter,
substitution of S for σ in Z typically introduces very little extra variability when n
is large (n > 40 was our earlier rule of thumb). In this case the resulting variable
still has approximately a standard normal distribution. The implied large-sample test
statistic for our weight-gain scenario is

    Z = (X̄ − 15) / (S/√n)

Thus when H₀ is true, Z has approximately a standard normal distribution.

Suppose that x̄ = 13.7, and that substitution of this along with s and n gives
z = −2.80. Which values of the test statistic are at least as contradictory to H₀
as −2.80 itself? To answer this, let’s first determine values of x̄ that are at least as
contradictory to H₀ as 13.7. One such value is 13.5, another is 13.0, and in fact any
value smaller than 13.7 is more contradictory to H₀ than 13.7.

But that is not the whole story. Recall that the alternative hypothesis says that
the value of μ is something other than 15. In light of this, the value 16.3 is just as
contradictory to H₀ as is 13.7; it falls the same distance above the null value 15 as
13.7 does below 15, and the resulting z value is 2.80, just as extreme as −2.80. And
any particular x̄ that exceeds 16.3 is just as contradictory to H₀ as is a value the same
distance below 15 (e.g., 16.8 and 13.2, 17.0 and 13.0, and so on).

Just as values of x̄ that are at most 13.7 correspond to z ≤ −2.80, values of x̄
that are at least 16.3 correspond to z ≥ 2.80. Thus values of the test statistic that are
at least as contradictory to H₀ as the value −2.80 actually obtained are {z: z ≤ −2.80
or z ≥ 2.80}. We can now calculate the probability, assuming H₀ true, of obtaining a
test statistic value at least as contradictory to H₀ as what our sample yielded:

    P(Z ≤ −2.80 or Z ≥ 2.80 assuming H₀ true)
        ≈ 2 · (area under the z curve to the right of 2.80)
        = 2[1 − Φ(2.80)] = 2[1 − .9974] = .0052


That is, if the null hypothesis is in fact true, only about one half of one percent of
all samples would result in a test statistic value at least as contradictory to the null
hypothesis as is our value. Clearly −2.80 is among the possible test statistic values
that are most contradictory to H₀. It would therefore make sense to reject H₀ in favor
of Hₐ.

Suppose we had instead obtained the test statistic value z = .89, which is less
than one standard deviation larger than what we’d expect if H₀ were true. The
foregoing probability would then be

    P(Z ≤ −.89 or Z ≥ .89 assuming H₀ true)
        ≈ 2 · (area under the z curve to the right of .89)
        = 2[1 − Φ(.89)] = 2[1 − .8133] = .3734

More than 1/3 of all samples would give a test statistic value at least as
contradictory to H₀ as is .89 when H₀ is true. So the data is quite consistent with the null
hypothesis; it remains plausible that μ = 15.

The article cited at the outset of this example reported that for a sample
of 137 students, the sample mean weight gain was only 2.42 lb with a
sample standard deviation of 5.72 lb (some students lost weight). This gives z =
(2.42 − 15)/(5.72/√137) = −25.7! The probability of observing a value at least
this extreme in either direction is essentially 0. The data very strongly contradicts the
null hypothesis, and there is substantial evidence that true average weight gain is much
closer to 0 than to 15. ■
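
The two-tailed calculations of Example 8.2 can be reproduced with the standard normal cdf. This is only a sketch under the large-sample assumption used in the example; the function names are hypothetical, and NormalDist comes from Python's standard library. Values computed this way can differ from the table-based ones in the fourth decimal place, because the table rounds Φ(z) before subtracting.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar: float, mu0: float, s: float, n: int) -> float:
    """Large-sample test statistic z = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))


def two_sided_p_value(z: float) -> float:
    """P(Z <= -|z| or Z >= |z|) for a standard normal Z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


p_280 = two_sided_p_value(-2.80)   # the text's table-based value is .0052
p_089 = two_sided_p_value(0.89)    # the text's table-based value is .3734

# Summary figures reported in the cited article: n = 137, mean 2.42, sd 5.72
z_article = z_statistic(xbar=2.42, mu0=15, s=5.72, n=137)
p_article = two_sided_p_value(z_article)

print(f"z = -2.80: P-value = {p_280:.4f}")
print(f"z =  0.89: P-value = {p_089:.4f}")
print(f"article:   z = {z_article:.1f}, P-value = {p_article:.3g}")
```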


The type of probability calculated in Examples 8.1 and 8.2 will now provide the 
basis for obtaining general test procedures. 


DEFINITIONS A test statistic is a function of the sample data used as a basis for deciding
whether H₀ should be rejected. The selected test statistic should discriminate
effectively between the two hypotheses. That is, values of the statistic that
tend to result when H₀ is true should be quite different from those typically
observed when H₀ is not true.


The P-value is the probability, calculated assuming that the null hypothesis is
true, of obtaining a value of the test statistic at least as contradictory to H₀ as
the value calculated from the available sample data. A conclusion is reached in
a hypothesis testing analysis by selecting a number α, called the significance
level (alternatively, level of significance) of the test, that is reasonably close to
0. Then H₀ will be rejected in favor of Hₐ if P-value ≤ α, whereas H₀ will not
be rejected (still considered to be plausible) if P-value > α. The significance
levels used most frequently in practice are (in order) α = .05, .01, .001, and .10.


For example, if we select a significance level of .05 and then compute P-value =
.0032, H₀ would be rejected because .0032 ≤ .05. With this same P-value, the null
hypothesis would also be rejected at the smaller significance level of .01 because
.0032 ≤ .01. However, at a significance level of .001 we would not be able to reject
H₀ since .0032 > .001. Figure 8.1 illustrates the comparison of the P-value with the
significance level in order to reach a conclusion.


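[Figure 8.1: number line from 0 to 1 with the P-value marked as the smallest level at
which H₀ can be rejected; region (b) lies to the left of the P-value and region (a) to its right]

Figure 8.1 Comparing α and the P-value: (a) reject H₀ when α lies here; (b) do not reject H₀
when α lies here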


We will shortly consider in some detail the consequences of selecting a smaller 
significance level rather than a larger one. For the moment, note that the smaller the 
significance level, the more protection is being given to the null hypothesis and the 
harder it is for that hypothesis to be rejected. 
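
The decision rule in the definition is a single comparison, which the following sketch makes explicit. The function name decide is our own, and the P-value .0032 is the illustrative value used in the text above.

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the rule: reject H0 if P-value <= alpha, otherwise do not reject."""
    if not 0 <= p_value <= 1:
        raise ValueError("a P-value is a probability and must lie in [0, 1]")
    if not 0 < alpha < 1:
        raise ValueError("the significance level must lie strictly between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# The same P-value leads to different conclusions at different levels.
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: {decide(0.0032, alpha)}")
```

Note that the comparison uses ≤, so a P-value exactly equal to α leads to rejection.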

The definition of a P-value is obviously somewhat complicated, and it doesn’t 
roll off the tongue very smoothly without a good deal of practice. In fact, many users 
of statistical methodology use the specified decision rule repeatedly to test hypoth- 
eses, but would be hard put to say what a P-value is! Here are some important points: 


• The P-value is a probability.

• This probability is calculated assuming that the null hypothesis is true.

• To determine the P-value, we must first decide which values of the test statistic
  are at least as contradictory to H₀ as the value obtained from our sample.

• The smaller the P-value, the stronger is the evidence against H₀ and in favor
  of Hₐ.

• The P-value is not the probability that the null hypothesis is true or that it is
  false, nor is it the probability that an erroneous conclusion is reached.


EXAMPLE 8.3 Urban storm water can be contaminated by many sources, including discarded bat- 
teries. When ruptured, these batteries release metals of environmental significance. 
The article “Urban Battery Litter” (J. Environ. Engr., 2009: 46-57) presented 
summary data for characteristics of a variety of batteries found in urban areas 
around Cleveland. A random sample of 51 Panasonic AAA batteries gave a sample 
mean zinc mass of 2.06 g. and a sample standard deviation of .141 g. Does this data 
provide compelling evidence for concluding that the population mean zinc mass 
exceeds 2.0 g.? Let’s employ a significance level of .01 to reach a conclusion. 
With μ denoting the true average zinc mass for such batteries, the relevant
hypotheses are

    H₀: μ = 2.0 versus Hₐ: μ > 2.0.


The reasonably large sample size allows us to invoke the Central Limit Theorem,
according to which the sample mean X̄ has approximately a normal distribution.
Furthermore, the standardized variable Z = (X̄ − μ)/(S/√n) has approximately a
standard normal distribution (the z curve). The test statistic results from
standardizing X̄ assuming that H₀ is true:

    Test statistic: Z = (X̄ − 2.0) / (S/√n)

Substituting n = 51, x̄ = 2.06, and s = .141 gives z = .06/.0197 = 3.04. The sample
mean here is roughly three (estimated) standard errors larger than would be expected
were H₀ true (it does not appear to exceed 2 by very much, but there is only a small
amount of variability in the 51 sample observations).

Any value of x̄ larger than 2.06 is more contradictory to H₀ than 2.06 itself,
and values of x̄ that exceed 2.06 correspond to values of z that exceed 3.04. So any
z ≥ 3.04 is at least as contradictory to H₀ as the observed value. Since the test statistic
has approximately a standard normal distribution when H₀ is true, we have

    P-value ≈ P(a standard normal rv is ≥ 3.04) = 1 − Φ(3.04) = 1 − .9988 = .0012

Because P-value = .0012 ≤ .01 = α, the null hypothesis should be rejected at the
chosen significance level. It appears that true average zinc mass does indeed exceed 2. ■
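
The steps of Example 8.3 (standardize, find the upper-tail area, compare with α) can be sketched as follows. The function upper_tailed_z_test is a hypothetical helper written for this illustration, and it relies on the same large-sample normal approximation as the example.

```python
from math import sqrt
from statistics import NormalDist


def upper_tailed_z_test(xbar: float, s: float, n: int, mu0: float, alpha: float):
    """Large-sample test of H0: mu = mu0 versus Ha: mu > mu0.

    Returns the test statistic, the P-value, and whether H0 is rejected.
    """
    z = (xbar - mu0) / (s / sqrt(n))      # estimated standard errors above mu0
    p_value = 1 - NormalDist().cdf(z)     # area under the z curve to the right of z
    return z, p_value, p_value <= alpha


# Zinc mass data: n = 51, sample mean 2.06 g, sample sd .141 g, alpha = .01
z, p_value, reject = upper_tailed_z_test(xbar=2.06, s=0.141, n=51, mu0=2.0, alpha=0.01)
print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```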


Errors in Hypothesis Testing 


The basis for choosing a particular significance level α lies in consideration of the errors
that one might be faced with in drawing a conclusion. Recall the judicial scenario in 
which the null hypothesis is that the individual accused of committing a crime is in fact 
innocent. In rendering a verdict, the jury must consider the possibility of committing one 
of two different kinds of errors. One of these involves convicting an innocent person, and 
the other involves letting a guilty person go free. Similarly, there are two different types 
of errors that might be made in the course of a statistical hypothesis testing analysis. 


DEFINITIONS A type I error consists of rejecting the null hypothesis H₀ when it is true.


A type II error involves not rejecting H₀ when it is false.

As an example, a cereal manufacturer claims that a serving of one of its brands
provides 100 calories (calorie content used to be determined by a destructive testing
method, but the requirement that nutritional information appear on packages has led
to more straightforward techniques). Of course the actual calorie content will vary
somewhat from serving to serving (of the specified size), so 100 should be
interpreted as an average. It could be distressing to consumers of this cereal if the true
average calorie content exceeded the asserted value. So an appropriate formulation
of hypotheses is to test H₀: μ = 100 versus Hₐ: μ > 100. The alternative hypothesis
says that consumers are ingesting on average a greater amount of calories than what
the company claims. A type I error here consists of rejecting the manufacturer’s
claim that μ = 100 when it is actually true. A type II error results from not rejecting
the manufacturer’s claim when it is actually the case that μ > 100.

Suppose μ₁ and μ₂ represent the true average lifetimes for two different brands of
rollerball pen under controlled experimental conditions (utilizing a machine that writes
continuously until a pen fails). It is natural to test the hypotheses H₀: μ₁ − μ₂ = 0
(i.e., μ₁ = μ₂) versus Hₐ: μ₁ − μ₂ ≠ 0 (i.e., μ₁ ≠ μ₂). A type I error would be to
conclude that the true average lifetimes are different when in fact they are identical. A type
II error involves deciding that the true average lifetimes may be the same when in fact
they really differ from one another.

In the best of all possible worlds, we’d have a judicial system that never con- 
victed an innocent person and never let a guilty person go free. This gold standard 
for judicial decisions has proven to be extremely elusive. Similarly, we would like 
to find test procedures for which neither type of error is ever committed. However, 
this ideal can be achieved only by basing a conclusion on an examination of the 
entire population. The difficulty with using a procedure based on sample data is that 
because of sampling variability, a sample unrepresentative of the population may 
result. In the calorie content scenario, even if the manufacturer’s assertion is
correct, an unusually large value of X̄ may result in a P-value smaller than the chosen
significance level and the consequent commission of a type I error. Alternatively, the 
true average calorie content may exceed what the manufacturer claims, yet a sample 
of servings may yield a relatively large P-value for which the null hypothesis cannot 
be rejected. 

Instead of demanding error-free test procedures, we must seek procedures for 
which either type of error is unlikely to be committed. That is, a good procedure is 
one for which the probability of making a type I error is small and the probability of 
making a type II error is also small. 


EXAMPLE 8.4 An automobile model is known to sustain no visible damage 25% of the time in
10-mph crash tests. A modified bumper design has been proposed in an effort to
increase this percentage. Let p denote the proportion of all 10-mph crashes with this
new bumper that result in no visible damage. The hypotheses to be tested are H₀:
p = .25 (no improvement) versus Hₐ: p > .25. The test will be based on an
experiment involving n = 20 independent crashes with prototypes of the new design. The
natural test statistic here is X = the number of crashes with no visible damage. If
H₀ is true, E(X) = np₀ = (20)(.25) = 5. Intuition suggests that an observed value
x much larger than this would provide strong evidence against H₀ and in support
of Hₐ.

Consider using a significance level of .10. The P-value is P(X ≥ x when X has
a binomial distribution with n = 20 and p = .25) = 1 − B(x − 1; 20, .25) for x > 0.
Appendix Table A.1 shows that in this case,

    P(X ≥ 7) = 1 − B(6; 20, .25) = 1 − .786 = .214
    P(X ≥ 8) = 1 − .898 = .102 ≈ .10,   P(X ≥ 9) = 1 − .959 = .041

Thus rejecting H₀ when P-value ≤ .10 is equivalent to rejecting H₀ when X ≥ 8.
Therefore

    P(committing a type I error) = P(rejecting H₀ when H₀ is true)
        = P(X ≥ 8 when X has a binomial distribution with n = 20 and p = .25)
        = .102
        ≈ .10


That is, the probability of a type I error is just the significance level α. If the null
hypothesis is true here and the test procedure is used over and over again, each time
in conjunction with a group of 20 crashes, in the long run the null hypothesis will be
incorrectly rejected in favor of the alternative hypothesis about 10% of the time. So
our test procedure offers reasonably good protection against committing a type I error.

There is only one type I error probability because there is only one value of the
parameter for which H₀ is true (this is one benefit of simplifying the null hypothesis
to a claim of equality). Let β denote the probability of committing a type II error.
Unfortunately there is not a single value of β, because there are a multitude of ways
for H₀ to be false: it could be false because p = .30, because p = .37, because p =
.5, and so on. There is in fact a different value of β for each different value of p that
exceeds .25. At the chosen significance level .10, H₀ will be rejected if and only if
X ≥ 8, so H₀ will not be rejected if and only if X ≤ 7. Thus

    β(.3) = P(type II error when p = .3)
          = P(H₀ is not rejected when p = .3)
          = P[X ≤ 7 when X ~ Bin(20, .3)]
          = B(7; 20, .3) = .772

When p is actually .3 rather than .25 (a “small” departure from H₀), roughly 77% of
all experiments of this type would result in H₀ being incorrectly not rejected!

The accompanying table displays β for selected values of p (each calculated
as we just did for β(.3)). Clearly, β decreases as the value of p moves farther to the
right of the null value .25. Intuitively, the greater the departure from H₀, the more
likely it is that such a departure will be detected.

    p    |  .3     .4     .5     .6     .7     .8
    β(p) |  .772   .416   .132   .021   .001   .000


The probability of committing a type II error here is quite large when p = .3 or .4.
This is because those values are quite close to what H₀ asserts and the sample size of
20 is too small to permit accurate discrimination between .25 and those values of p.

The proposed test procedure is still reasonable for testing the more realistic null
hypothesis that p ≤ .25. In this case, there is no longer a single type I error probability
α, but instead there is an α for each p that is at most .25: α(.25), α(.23), α(.20), α(.15),
and so on. It is easily verified, though, that α(p) < α(.25) = .102 if p < .25. That is,
the largest type I error probability occurs for the boundary value .25 between H₀ and
Hₐ. Thus if α is small for the simplified null hypothesis, it will also be as small as or
smaller for the more realistic H₀. ■
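
The error probabilities of Example 8.4 follow directly from the binomial cdf once the rejection region X ≥ 8 is fixed. The sketch below is an illustration only; the constant and function names are our own, and the cdf is summed from its definition rather than read from Appendix Table A.1.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """Cumulative binomial probability B(x; n, p) = P(X <= x)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


N = 20          # number of crashes
P0 = 0.25       # null value of p
REJECT_AT = 8   # reject H0 when X >= 8


def upper_tail_p_value(x: int) -> float:
    """P(X >= x) when X ~ Bin(N, P0), i.e. 1 - B(x - 1; N, P0)."""
    return 1 - binom_cdf(x - 1, N, P0)


def beta(p: float) -> float:
    """Type II error probability: P(X <= REJECT_AT - 1) when the true proportion is p."""
    return binom_cdf(REJECT_AT - 1, N, p)


# Type I error probability: P(X >= 8) computed at the null value
alpha = upper_tail_p_value(REJECT_AT)
print(f"alpha = {alpha:.3f}")

for p in (0.3, 0.4, 0.5, 0.6, 0.7, 0.8):
    print(f"beta({p}) = {beta(p):.3f}")

# Type I error probability at a value of p inside the more realistic H0: p <= .25
alpha_at_20 = 1 - binom_cdf(REJECT_AT - 1, N, 0.20)
print(f"alpha(.20) = {alpha_at_20:.3f}")
```

The last line illustrates the closing remark of the example: for a value of p below the boundary .25, the probability of rejecting H₀ is smaller than at the boundary itself.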


EXAMPLE 8.5 The drying time of a type of paint under specified test conditions is known to
be normally distributed with mean value 75 min and standard deviation 9 min.
Chemists have proposed a new additive designed to decrease average drying time. It
is believed that drying times with this additive will remain normally distributed with
σ = 9. Because of the expense associated with the additive, evidence should strongly
suggest an improvement in average drying time before such a conclusion is adopted.
Let μ denote the true average drying time when the additive is used. The appropriate
hypotheses are H₀: μ = 75 versus Hₐ: μ < 75. Only if H₀ can be rejected will the
additive be declared successful and used.

Experimental data is to consist of drying times from n = 25 test specimens.
Let X₁, ..., X₂₅ denote the 25 drying times, a random sample of size 25 from a
normal distribution with mean value μ and standard deviation σ = 9 (although the
assumption of a known value of σ is generally unrealistic in practice, it considerably
simplifies calculation of type II error probabilities). The sample mean drying time
X̄ then has a normal distribution with expected value μ_X̄ = μ and standard deviation
σ_X̄ = σ/√n = 9/√25 = 1.8. When H₀ is true, we expect X̄ to be 75; a sample
mean much smaller than this would be contradictory to H₀ and supportive of Hₐ.

Our test statistic here will be X̄ standardized assuming that H₀ is true:

    Z = (X̄ − μ₀) / (σ/√n) = (X̄ − 75) / 1.8

The sampling distribution of X̄ is normal because the population distribution is
normal, which implies that Z has a standard normal distribution when H₀ is true (in
contrast to Examples 8.2 and 8.3, we are assuming and using a known value of σ here).

Consider carrying out the test using a significance level of .01, i.e., H₀ will be
rejected if P-value ≤ .01. For a given value x̄ of the sample mean and corresponding
calculated value z, the form of the alternative hypothesis implies that values more
contradictory to H₀ than this are values less than x̄ and, correspondingly, values of
the test statistic that are less than z. Thus the P-value is

    P-value = P(obtaining a value of Z at least as contradictory to
                H₀ as z when H₀ is true)
            = P(Z ≤ z when H₀ is true)
            = area under the standard normal curve to the left of z
            = Φ(z)


So the P-value will equal .01 when z captures lower-tail area .01 under the z curve.
From Appendix Table A.3, this happens when z = −2.33 [verify that Φ(−2.33) = .01].
As illustrated in Figure 8.2, the P-value will therefore be at most .01 when z ≤ −2.33.
This in turn implies that

    P(type I error) = P(rejecting H₀ when H₀ is true)
                    = P(P-value ≤ .01 when H₀ is true)
                    = P(Z ≤ −2.33 when Z has a standard normal distribution)
                    = .01


[Figure 8.2: z (standard normal) curve with lower-tail area .01 shaded to the left of −2.33]

Figure 8.2 P-value ≤ .01 if and only if z ≤ −2.33
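
The equivalence between "P-value ≤ .01" and "z ≤ −2.33" can be checked by inverting the standard normal cdf. The sketch below uses NormalDist from Python's standard library; the variable names are our own. The unrounded critical value is slightly different from the table value −2.33, which is rounded to two decimals.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def lower_tail_p_value(z: float) -> float:
    """P-value for a lower-tailed z test: the area to the left of z, Phi(z)."""
    return STD_NORMAL.cdf(z)


alpha = 0.01
z_crit = STD_NORMAL.inv_cdf(alpha)   # z value capturing lower-tail area alpha
xbar_cutoff = 75 + z_crit * 1.8      # same boundary on the scale of the sample mean

print(f"critical value z = {z_crit:.3f}")
print(f"sample mean cutoff = {xbar_cutoff:.2f}")

# Values at or below the critical value give P-value <= alpha, others do not.
for z in (-3.0, -2.33, -2.0):
    p = lower_tail_p_value(z)
    print(f"z = {z}: P-value = {p:.4f}, reject H0: {p <= alpha}")
```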


As in the previous example, the chosen significance level α is in fact the probability
of committing a type I error. If the above test procedure [test statistic Z, reject H₀ if
P-value ≤ .01] is used repeatedly on sample after sample, in the long run the null
hypothesis will be incorrectly rejected only 1% of the time. Our proposed test procedure
offers excellent protection against the commission of a type I error. Note that if the more
realistic null hypothesis H₀: μ ≥ 75 is considered, it can be shown that P(type I error) ≤
.01; the maximum occurs at the null value 75, which is the boundary between H₀ and Hₐ.

The calculation of P(type I error) in this example relied on the fact that P-value ≤
.01 is equivalent to Z = (X̄ − 75)/1.8 ≤ −2.33. Multiplying both sides of this latter
inequality by 1.8 and then adding 75 to both sides results in X̄ ≤ 70.8. Thus rejecting
H₀ at significance level .01 [if P-value ≤ .01] is equivalent to rejecting H₀ if X̄ ≤ 70.8;
H₀ will not be rejected if X̄ > 70.8. The probability of committing a type II error when
μ = 72 is now

    β(72) = P(not rejecting H₀ when μ = 72)
          = P(X̄ > 70.8 when X̄ ~ normal with μ_X̄ = 72, σ_X̄ = 1.8)
          = 1 − Φ[(70.8 − 72)/1.8] = 1 − Φ(−.67) = 1 − .2514 = .7486


This is an awfully large error probability. If the test with α = .01 is used repeatedly
on sample after sample and the actual value of μ is 72, almost 75% of the time the
null hypothesis will not be rejected. The difficulty is that 72 is too close to the null
value for a test with this sample size and value of α to have a good chance of
detecting such a departure from H₀.

Similar calculations give

    β(70) = 1 − Φ[(70.8 − 70)/1.8] = .3300,   β(67) = .0174

These type II error probabilities are much smaller than β(72) because 70 and 67 are
both farther away from the null value than is 72. Figure 8.3 illustrates α and the first
two type II error probabilities.


Figure 8.3 α and β illustrated for Example 8.5: (a) the distribution of X̄ when μ = 75 (H₀ true),
shaded area = α = .01; (b) the distribution of X̄ when μ = 72 (H₀ false), shaded area = β(72);
(c) the distribution of X̄ when μ = 70 (H₀ false), shaded area = β(70)

The investigators might regard μ = 72 as an important departure from the
null hypothesis, in which case β(72) = .7486 is intolerably large. Consider changing
the significance level (type I error probability) from .01 to .05; that is, we now
propose rejecting H₀ if P-value ≤ .05. Appendix Table A.3 shows that the z critical
value −1.645 captures a lower-tail z curve area of .05. Using the same reasoning that
we previously applied when α = .01, rejecting H₀ when P-value ≤ .05 is equivalent
to rejecting when Z ≤ −1.645. This in turn is equivalent to rejecting when X̄ ≤ 72
(notice that by increasing the significance level, we have made it easier for the null
hypothesis to get rejected). Proceeding as in the previous calculations, we find that


β(72) = .5, β(70) = .1335, β(67) = .0027


These type II error probabilities are all smaller than their counterparts for the test
with α = .01. The important message here is that if a larger significance level (type
I error probability) can be tolerated, then the resulting test will have better ability to
detect when the null hypothesis is false.
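
The error probabilities in Example 8.5 can be reproduced with a few lines of code. The following is a minimal sketch using only Python's standard library; the function name `beta_lower_tailed` and its default arguments (null value 75, standard deviation of X̄ equal to 1.8) are choices made here for illustration. Because the code uses unrounded z critical values rather than the rounded cutoffs 70.8 and 72, its output agrees with the figures quoted in the example only to about two decimal places.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def beta_lower_tailed(mu_alt, mu0=75.0, se=1.8, alpha=0.01):
    """Type II error probability of the lower-tailed z test when mu = mu_alt."""
    # H0 is rejected when xbar <= cutoff, where cutoff = mu0 + z * se and z is
    # the standard normal quantile with lower-tail area alpha (a negative number).
    cutoff = mu0 + Z.inv_cdf(alpha) * se
    # beta = P(Xbar > cutoff) when Xbar is normal with mean mu_alt and sd se
    return 1.0 - NormalDist(mu_alt, se).cdf(cutoff)


for alpha in (0.01, 0.05):
    betas = {mu: round(beta_lower_tailed(mu, alpha=alpha), 4) for mu in (72, 70, 67)}
    print(f"alpha = {alpha}: {betas}")
```

Running the loop for both significance levels shows every β shrinking as α grows from .01 to .05, which is the trade-off the example describes.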


It is no accident that in the two foregoing examples, the significance level α turned
out to be the probability of a type I error.


PROPOSITION The test procedure that rejects H₀ if P-value ≤ α and otherwise does not reject
H₀ has P(type I error) = α. That is, the significance level employed in the test
procedure is the probability of a type I error.


A partial proof of this proposition is sketched out at the end of the section. 
The inverse relationship between the significance level α and type II error
probabilities in Example 8.5 can be generalized in the following manner:


PROPOSITION Suppose an experiment or sampling procedure is selected, a sample size is
specified, and a test statistic is chosen. Then increasing the significance level
α, i.e., employing a larger type I error probability, results in a smaller value of
β for any particular parameter value consistent with Hₐ.


This result is intuitively obvious because when α is increased, it becomes more
likely that we'll have P-value ≤ α and therefore less likely that P-value > α.

The proposition implies that once the test statistic and n are fixed, it is not possible
to make both α and any values of β that might be of interest arbitrarily small.
Deciding on an appropriate significance level involves compromising between small
α and small β’s. In Example 8.5, the type II error probability for a test with α = .01
was quite large for a value of μ close to the value in H₀. A strategy that is sometimes
(but perhaps not often enough) used in practice is to specify α and also β for some
alternative value of the parameter that is of particular importance to the investigator.
Then the sample size n can be determined to satisfy these two conditions. For
example, the article “Cognitive Treatment of Illness Perceptions in Patients with
Chronic Low Back Pain: A Randomized Controlled Trial” (Physical Therapy,
2013: 435-438) contains the following passage: “A decrease of 18 to 24 mm on
the PSC was determined as being a clinically relevant change in patients with low
back pain. The sample size was calculated with a minimum change of 18 mm, a
2-sided α of .05, a 1 − β of .90, and a standard deviation of 26.01.... The sample
size calculation resulted in a total of 135 participants.” We’ll consider such sample
size determinations in subsequent sections and chapters.

In practice it is usually the case that the hypotheses of interest can be formulated
so that a type I error is more serious than a type II error. The approach adhered
to by most statistical practitioners is to reflect on the relative seriousness of a type I
error compared to a type II error and then use the largest value of α that can be tolerated.
This amounts to doing the best we can with respect to type II error probabilities
while ensuring that the type I error probability is sufficiently small. For example, if
α = .05 is the largest significance level that can be tolerated, it would be better to
use that rather than α = .01, because all β’s for the former α will be smaller than
those for the latter one. As previously mentioned, the most frequently employed
significance levels are α = .05, .01, .001, and .10. However, there are exceptions. Here
is one example from particle physics: according to the article “Discovery or Fluke:
Statistics in Particle Physics” (Physics Today, July 2012: 45-50), “the usual
choice of alpha is 3 × 10⁻⁷, corresponding to the 5σ of a Gaussian [i.e., normal] H₀
distribution. ... Why so stringent? For one thing, recent history offers many cautionary
examples of exciting 3σ and 4σ signals that went away when more data arrived.”
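
As a quick check on the quoted figure, the one-sided tail area beyond 5 standard deviations of a normal distribution can be computed directly. This is only an illustrative sketch using Python's standard library; the variable names are arbitrary, and the one-sided reading of "5σ" is an assumption made here because it is the reading that reproduces a value near 3 × 10⁻⁷.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal cdf

# One-sided tail areas beyond 3, 4, and 5 standard deviations
for k in (3, 4, 5):
    print(f"{k} sigma: {phi(-k):.2e}")

alpha_5_sigma = phi(-5.0)
```

The 5σ tail area comes out a little under 3 × 10⁻⁷, several orders of magnitude smaller than the areas for 3σ and 4σ.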

If the distribution of the test statistic is continuous (e.g., if the test statistic has
the standard normal distribution or a particular t distribution when H₀ is true), then
any significance level α between 0 and 1 can be employed; for example, reject H₀
if P-value ≤ .035. However, this is not necessarily the case if the distribution of the
test statistic is discrete. As an example, consider again the bumper design scenario of
Example 8.4 in which the hypotheses of interest were H₀: p = .25 versus Hₐ: p > .25.
The test statistic X had a binomial distribution and


P-value = P(X ≥ x when n = 20 and p = .25)


Appendix Table A.1 shows that in this case, P(X ≥ 8) = .102 and P(X ≥ 9) = .041.
Thus if we want the significance level to be .05, the closest achievable level is actually
.041: reject H₀ if P-value ≤ .041.
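
The set of achievable significance levels for this discrete test statistic can be listed by computing the binomial upper-tail probabilities directly instead of reading them from a table. The sketch below uses only the standard library; the helper name `binom_upper_tail` is introduced here for illustration.

```python
from math import comb


def binom_upper_tail(x, n=20, p=0.25):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


# Each possible observed x gives one possible P-value, so these upper-tail
# probabilities are the only significance levels the test can attain exactly.
for x in range(6, 12):
    print(x, round(binom_upper_tail(x), 3))
```

No value of x yields an upper-tail probability of exactly .05; the attainable levels jump from about .102 (x = 8) to about .041 (x = 9).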


Some Further Comments on the P-Value 


Suppose that the P-value is calculated to be .038. The null hypothesis will then be
rejected if .038 ≤ α and not rejected otherwise. So H₀ can be rejected if α = .10
or .05 but not if α = .01 or .001. In fact, H₀ would be rejected for any significance
level that is at least .038 but not for any level smaller than .038. For this reason, the
P-value is often referred to as the observed significance level (OSL): it is the smallest
value of α for which H₀ can be rejected.

One very appealing aspect of basing a conclusion from a hypothesis testing
analysis on the P-value is that all widely used statistical software packages
will calculate and output the P-value for any of the commonly used test procedures.
Once the P-value is available, the investigator need only compare it to the
selected significance level to decide whether H₀ should be rejected. This explains
how an investigator can forget the definition of a P-value and still use it to reach
a conclusion!

Sometimes a situation is encountered in which various individuals are interested
in testing the same pair of hypotheses but may wish to use different significance levels.
For example, suppose the true average time to pain relief for the current best-selling
pain reliever is known to be 15 minutes. A new formulation has been developed that it
is hoped will reduce this time. The relevant hypotheses are H₀: μ = 15 versus Hₐ: μ <
15, where μ is the true average time to relief using the new formulation. You may be
quite satisfied with the current product and therefore wish to use a small significance
level such as .01. I on the other hand may be less satisfied and thus more willing to
switch, in which case a larger level such as .10 may be sensible. In using the larger α,
I am giving less protection to H₀ than you are. Once the P-value is available, each of us
can employ our own significance level irrespective of what the other person is using.
Thus when medical journals report a P-value, a significance level is not mandated;
instead it is left to the reader to select his or her own level and conclude accordingly.
Furthermore, if someone else carried out the test and simply reported that H₀ was
rejected at significance level .05 without revealing the P-value, then anyone wishing
to use a smaller significance level would not know which conclusion is appropriate.
That individual would have imposed his or her own significance level on other decision
makers. Access to the P-value prevents such an imposition.

A final point concerning the utility of the P-value is that it allows one to
distinguish between a close call and a very clear-cut conclusion at any particular
significance level. For example, suppose you are told that H₀ was rejected at
significance level .05. This conclusion is consistent with a P-value of .0498 and also
with a P-value of .0003, since in each case P-value ≤ α = .05. But of course with
a P-value of .0498, the null hypothesis is barely rejected, whereas with P-value =
.0003, the null hypothesis is rejected by a country mile. So it is always preferable to
report the P-value rather than just stating the conclusion at a particular significance
level.

Unfortunately most journal articles containing summaries of hypothesis testing
analyses do not report exact P-values. Instead what typically appears is one of
the following statements: “P < .05” if the P-value is between .05 and .01, “P < .01”
if it is between .01 and .001, and “P < .001” if the P-value really is smaller than
.001. In a tabular summary, you will often see *, **, and *** corresponding to these
three cases.


Proof of the proposition stating that P(type I error) = the significance level α:


Denote the test statistic by Y, and let F(·) be the cumulative distribution function
of Y when H₀ is true (e.g., F might be the standard normal cdf Φ or the cdf of an rv
having a t distribution with some specified number of df). Suppose the distribution
of Y is continuous over some interval (often infinite in extent) so that F is a strictly
increasing function over this interval. Then F has a well-defined inverse function F⁻¹.
Consider the case in which only values of the test statistic smaller than the calculated
value y are more contradictory to H₀ than y itself. This implies that


P-value = P(obtaining a test statistic value at least
as contradictory to H₀ when H₀ is true) = F(y)


Now before the sample data is available, the value of the test statistic is a random
variable Y, and so the P-value itself is a random variable. Thus


P(type I error) = P(P-value ≤ α when H₀ is true) = P(F(Y) ≤ α)


Let’s now apply F⁻¹ to both sides of the inequality inside the last set of parentheses:


P(type I error) = P[F⁻¹(F(Y)) ≤ F⁻¹(α)] = P(Y ≤ F⁻¹(α)) = F(F⁻¹(α)) = α


The argument in the case in which only values of Y larger than y are more contradictory
to H₀ than y itself is similar to what we have just shown. The case in which
either large or small Y values are more contradictory to H₀ than y itself is a bit
trickier. And when the test statistic has a discrete distribution, the inverse function
F⁻¹ is not uniquely defined, so extra care is needed to make the argument
valid.
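
The conclusion of the proof can also be checked empirically. The following simulation is a rough sketch under stated assumptions, not part of the proof: the test statistic Y is taken to be standard normal when H₀ is true, the test is lower-tailed so that the P-value is F(Y) = Φ(Y), and the function name, number of replications, and seed are arbitrary choices made here.

```python
import random
from statistics import NormalDist


def simulated_type_i_error(alpha, reps=100_000, seed=1):
    """Fraction of simulated lower-tailed z tests that reject a true H0."""
    rng = random.Random(seed)
    phi = NormalDist().cdf
    rejections = 0
    for _ in range(reps):
        y = rng.gauss(0.0, 1.0)  # test statistic generated with H0 true
        p_value = phi(y)         # lower-tailed P-value F(y)
        if p_value <= alpha:     # reject H0 when P-value <= alpha
            rejections += 1
    return rejections / reps


for alpha in (0.01, 0.05, 0.10):
    print(alpha, simulated_type_i_error(alpha))
```

Each simulated rejection rate should land close to the corresponding α, differing only by simulation noise.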
EXERCISES Section 8.1 (1–14)


1. For each of the following assertions, state whether it is a
legitimate statistical hypothesis and why:

a. H: σ > 100
b. H: x = 45
c. H: s = .20
d. H: σ₁/σ₂ < 1
e. H: X − Y = 5
f. H: λ = .01, where λ is the parameter of an exponential
distribution used to model component lifetime


2. For the following pairs of assertions, indicate which do
not comply with our rules for setting up hypotheses and
why (the subscripts 1 and 2 differentiate between quantities
for two different populations or samples):

a. H₀: μ = 100, Hₐ: μ > 100
b. H₀: σ = 20, Hₐ: σ ≤ 20
c. H₀: p ≠ .25, Hₐ: p = .25
d. H₀: μ₁ − μ₂ = 25, Hₐ: μ₁ − μ₂ > 100
e. H₀: S₁² = S₂², Hₐ: S₁² ≠ S₂²
f. H₀: μ = 120, Hₐ: μ = 150
g. H₀: σ₁/σ₂ = 1, Hₐ: σ₁/σ₂ ≠ 1
h. H₀: p₁ − p₂ = −.1, Hₐ: p₁ − p₂ < −.1


3. For which of the given P-values would the null hypothesis
be rejected when performing a level .05 test?

a. .001
b. .021
c. .078
d. .047
e. .148


4. Pairs of P-values and significance levels, α, are given.
For each pair, state whether the observed P-value would
lead to rejection of H₀ at the given significance level.

a. P-value = .084, α = .05
b. P-value = .003, α = .001
c. P-value = .498, α = .05
d. P-value = .084, α = .10
e. P-value = .039, α = .01
f. P-value = .218, α = .10


5. To determine whether the pipe welds in a nuclear power
plant meet specifications, a random sample of welds is
selected, and tests are conducted on each weld in the
sample. Weld strength is measured as the force required
to break the weld. Suppose the specifications state that
mean strength of welds should exceed 100 lb/in²; the
inspection team decides to test H₀: μ = 100 versus
Hₐ: μ > 100. Explain why it might be preferable to use
this Hₐ rather than μ < 100.


6. Let μ denote the true average radioactivity level (picocuries
per liter). The value 5 pCi/L is considered the dividing
line between safe and unsafe water. Would you recommend
testing H₀: μ = 5 versus Hₐ: μ > 5 or H₀: μ = 5
versus Hₐ: μ < 5? Explain your reasoning. [Hint: Think
about the consequences of a type I and type II error for
each possibility.]


7. Before agreeing to purchase a large order of polyethylene
sheaths for a particular type of high-pressure oil-filled
submarine power cable, a company wants to see
conclusive evidence that the true standard deviation of
sheath thickness is less than .05 mm. What hypotheses
should be tested, and why? In this context, what are the
type I and type II errors?


8. Many older homes have electrical systems that use fuses
rather than circuit breakers. A manufacturer of 40-amp
fuses wants to make sure that the mean amperage at
which its fuses burn out is in fact 40. If the mean amperage
is lower than 40, customers will complain because
the fuses require replacement too often. If the mean
amperage is higher than 40, the manufacturer might be
liable for damage to an electrical system due to fuse
malfunction. To verify the amperage of the fuses, a
sample of fuses is to be selected and inspected. If a
hypothesis test were to be performed on the resulting
data, what null and alternative hypotheses would be of
interest to the manufacturer? Describe type I and type II
errors in the context of this problem situation.


9. Water samples are taken from water used for cooling as it
is being discharged from a power plant into a river. It has
been determined that as long as the mean temperature of
the discharged water is at most 150°F, there will be no
negative effects on the river’s ecosystem. To investigate
whether the plant is in compliance with regulations that
prohibit a mean discharge water temperature above 150°,
50 water samples will be taken at randomly selected times
and the temperature of each sample recorded. The resulting
data will be used to test the hypotheses H₀: μ = 150° versus
Hₐ: μ > 150°. In the context of this situation, describe
type I and type II errors. Which type of error would you
consider more serious? Explain.


10. A regular type of laminate is currently being used by a
manufacturer of circuit boards. A special laminate has
been developed to reduce warpage. The regular laminate
will be used on one sample of specimens and the special
laminate on another sample, and the amount of warpage
will then be determined for each specimen. The manufacturer
will then switch to the special laminate only if it can
be demonstrated that the true average amount of warpage
for that laminate is less than for the regular laminate. State
the relevant hypotheses, and describe the type I and type II
errors in the context of this situation.


11. Two different companies have applied to provide cable
television service in a certain region. Let p denote the
proportion of all potential subscribers who favor the first
company over the second. Consider testing H₀: p = .5
versus Hₐ: p ≠ .5 based on a random sample of 25 individuals.
Let the test statistic X be the number in the
sample who favor the first company and x represent the
observed value of X.

a. Describe type I and II errors in the context of this
problem situation.

b. Suppose that x = 6. Which values of X are at least as
contradictory to H₀ as this one?

c. What is the probability distribution of the test statistic
X when H₀ is true? Use it to compute the P-value
when x = 6.

d. If H₀ is to be rejected when P-value ≤ .044, compute
the probability of a type II error when p = .4, again
when p = .3, and also when p = .6 and p = .7. [Hint:
P-value > .044 is equivalent to what inequalities
involving x (see Example 8.4)?]

e. Using the test procedure of (d), what would you
conclude if 6 of the 25 queried favored company 1?


12. A mixture of pulverized fuel ash and Portland cement to
be used for grouting should have a compressive strength
of more than 1300 KN/m². The mixture will not be used
unless experimental evidence indicates conclusively that
the strength specification has been met. Suppose compressive
strength for specimens of this mixture is normally
distributed with σ = 60. Let μ denote the true average
compressive strength.

a. What are the appropriate null and alternative
hypotheses?

b. Let X̄ denote the sample average compressive strength
for n = 10 randomly selected specimens. Consider
the test procedure with test statistic X̄ itself (not standardized).
If x̄ = 1340, should H₀ be rejected using a
significance level of .01? [Hint: What is the probability
distribution of the test statistic when H₀ is true?]

c. What is the probability distribution of the test statistic
when μ = 1350? For a test with α = .01, what is the
probability that the mixture will be judged unsatisfactory
when in fact μ = 1350 (a type II error)?


13. The calibration of a scale is to be checked by weighing a
10-kg test specimen 25 times. Suppose that the results of
different weighings are independent of one another and
that the weight on each trial is normally distributed with
σ = .200 kg. Let μ denote the true average weight reading
on the scale.

a. What hypotheses should be tested?

b. With the sample mean itself as the test statistic, what
is the P-value when x̄ = 9.85, and what would you
conclude at significance level .01?

c. For a test with α = .01, what is the probability that
recalibration is judged unnecessary when in fact μ =
10.1? When μ = 9.8?


14. A new design for the braking system on a certain type of
car has been proposed. For the current system, the true
average braking distance at 40 mph under specified conditions
is known to be 120 ft. It is proposed that the new
design be implemented only if sample data strongly
indicates a reduction in true average braking distance for
the new design.

a. Define the parameter of interest and state the relevant
hypotheses.

b. Suppose braking distance for the new system is
normally distributed with σ = 10. Let X̄ denote the
sample average braking distance for a random
sample of 36 observations. Which values of x̄ are
more contradictory to H₀ than 117.2, what is the
P-value in this case, and what conclusion is appropriate
if α = .10?

c. What is the probability that the new design is not
implemented when its true average braking distance
is actually 115 ft and the test from part (b) is used?

8.2 z Tests for Hypotheses about a Population Mean 


Recall from the previous section that a conclusion in a hypothesis testing analysis is 
reached by proceeding as follows: 


i. Compute the value of an appropriate test statistic. 


ii. Then determine the P-value: the probability, calculated assuming that the null
hypothesis H₀ is true, of observing a test statistic value at least as contradictory to
H₀ as what resulted from the available data.


iii. Reject the null hypothesis if P-value ≤ α, where α is the specified or chosen
significance level, i.e., the probability of a type I error (rejecting H₀ when it is
true); if P-value > α, there is not enough evidence to justify rejecting H₀ (it is
still deemed plausible).


Determination of the P-value depends on the distribution of the test statistic when H₀
is true. In this section we describe z tests for testing hypotheses about a single population
mean μ. By “z test,” we mean that the test statistic has at least approximately a
standard normal distribution when H₀ is true. The P-value will then be a z curve area
which depends on whether the inequality in Hₐ is >, <, or ≠.

In the development of confidence intervals for μ in Chapter 7, we first considered
the case in which the population distribution is normal with known σ, then relaxed
the normality and known σ assumptions when the sample size n is large, and finally
described the one-sample t CI for the mean of a normal population. In this section we
discuss the first two cases, and then present the one-sample t test in Section 8.3.


A Normal Population Distribution with Known σ


Although the assumption that the value of σ is known is rarely met in practice, this
case provides a good starting point because of the ease with which general procedures
and their properties can be developed. The null hypothesis in all three cases
will state that μ has a particular numerical value, the null value. We denote this value
by the symbol μ₀, so the null hypothesis has the form H₀: μ = μ₀. Let X₁, ..., Xₙ
represent a random sample of size n from the normal population. Then the sample
mean X̄ has a normal distribution with expected value μ_X̄ = μ and standard deviation
σ_X̄ = σ/√n. When H₀ is true, μ_X̄ = μ₀. Consider now the statistic Z obtained
by standardizing X̄ under the assumption that H₀ is true:

\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]

Substitution of the computed sample mean x̄ gives z, the distance between x̄ and
μ₀ expressed in “standard deviation units.” For example, if the null hypothesis is
H₀: μ = 100, σ_X̄ = σ/√n = 10/√25 = 2.0, and x̄ = 103, then the test statistic
value is z = (103 − 100)/2.0 = 1.5. That is, the observed value of x̄ is 1.5 standard
deviations (of X̄) larger than what we expect it to be when H₀ is true. The statistic Z
is a natural measure of the distance between X̄, the estimator of μ, and its expected
value when H₀ is true. If this distance is too great in a direction consistent with Hₐ,
there is substantial evidence that H₀ is false.

Suppose first that the alternative hypothesis is of the form Hₐ: μ > μ₀. Then an
x̄ value that considerably exceeds μ₀ provides evidence against H₀. Such an x̄ value
corresponds to a large positive value z. This in turn implies that any value exceeding
the calculated z is more contradictory to H₀ than is z itself. It follows that


P-value = P(Z ≥ z when H₀ is true)


Now here is the key point: when H₀ is true, the test statistic Z has a standard normal
distribution, because we created Z by standardizing X̄ assuming that H₀ is true (i.e.,
by subtracting μ₀). The implication is that in this case, the P-value is just the area
under the standard normal curve to the right of z. Because of this, the test is referred
to as upper-tailed. For example, in the previous paragraph we calculated z = 1.5.
If the alternative hypothesis there is Hₐ: μ > 100, then P-value = area under the
z curve to the right of 1.5 = 1 − Φ(1.50) = .0668. At significance level .05 we
would not be able to reject the null hypothesis because the P-value exceeds α.

Now consider an alternative hypothesis of the form Hₐ: μ < μ₀. In this case
any value of the sample mean smaller than our x̄ is even more contradictory to the
null hypothesis. Thus any test statistic value smaller than the calculated z is more
contradictory to H₀ than is z itself. It follows that


P-value = P(Z ≤ z when H₀ is true)
= area under the standard normal curve to the left of z = Φ(z)


The test in this case is customarily referred to as lower-tailed. If, for example, the
alternative hypothesis is Hₐ: μ < 100 and z = −2.75, then P-value = Φ(−2.75) =
.0030. This is small enough to justify rejection of H₀ at a significance level of either
.05 or .01, but not .001.

The third possible alternative, Hₐ: μ ≠ μ₀, requires a bit more careful thought.
Suppose, for example, that the null value is 100 and that x̄ = 103 results in z = 1.5.
Then any x̄ value exceeding 103 is more contradictory to H₀ than is 103 itself. So any
z exceeding 1.5 is likewise more contradictory to H₀ than is 1.5. However, 97 is just as
contradictory to the null hypothesis as is 103, since it is the same distance below 100 as
103 is above 100. Thus z = −1.5 is just as contradictory to H₀ as is z = 1.5. Therefore
any z smaller than −1.5 is more contradictory to H₀ than is 1.5 or −1.5. It follows that


P-value = P(Z either ≥ 1.5 or ≤ −1.5 when H₀ is true)
= (area under the z curve to the right of 1.5)
+ (area under the z curve to the left of −1.5)
= 1 − Φ(1.5) + Φ(−1.5) = 2[1 − Φ(1.5)]
= 2(.0668) = .1336


This would also be the P-value if x̄ = 97 results in z = −1.5. The important point is
that because of the inequality ≠ in Hₐ, the P-value is the sum of an upper-tail area and
a lower-tail area. By symmetry of the standard normal distribution, this becomes twice
the area captured in the tail in which z falls. Equivalently, it is twice the area captured
in the upper tail by |z|, i.e., 2[1 − Φ(|z|)]. It is natural to refer to this test as being
two-tailed because z values far out in either tail of the z curve argue for rejection of H₀.

The test procedure is summarized in the accompanying box, and the P-value
for each of the possible alternative hypotheses is illustrated in Figure 8.4.


Figure 8.4 Determination of the P-value for a z test: (1) upper-tailed test, Hₐ contains the
inequality >, P-value = area in upper tail of the z curve = 1 − Φ(z); (2) lower-tailed test,
Hₐ contains the inequality <, P-value = area in lower tail = Φ(z); (3) two-tailed test,
Hₐ contains the inequality ≠, P-value = sum of area in two tails = 2[1 − Φ(|z|)]


Null hypothesis: H₀: μ = μ₀

Test statistic:

\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]

Alternative Hypothesis    P-Value Determination

Hₐ: μ > μ₀                Area under the standard normal curve to the right of z
Hₐ: μ < μ₀                Area under the standard normal curve to the left of z
Hₐ: μ ≠ μ₀                2 · (area under the standard normal curve to the right of |z|)


Assumptions: A normal population distribution with known value of σ.
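
The boxed procedure translates directly into code. The following is a minimal sketch using Python's standard library; the function name `z_test` and the labels "greater", "less", and "two-sided" for the three alternatives are naming choices made here, not notation from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alternative):
    """Return (z, P-value) for a z test of H0: mu = mu0 with known sigma.

    alternative: "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0),
    or "two-sided" (Ha: mu != mu0).
    """
    phi = NormalDist().cdf                  # standard normal cdf
    z = (xbar - mu0) / (sigma / sqrt(n))    # standardize xbar assuming H0 is true
    if alternative == "greater":
        p_value = 1.0 - phi(z)              # area to the right of z
    elif alternative == "less":
        p_value = phi(z)                    # area to the left of z
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - phi(abs(z)))  # twice the area to the right of |z|
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Illustration from the discussion above: mu0 = 100, sigma = 10, n = 25, xbar = 103
for alt in ("greater", "two-sided"):
    z, p = z_test(103, 100, 10, 25, alt)
    print(alt, round(z, 2), round(p, 4))
```

With these inputs z = 1.5 in both cases, and the two P-values match the upper-tailed and two-tailed areas worked out earlier in the section.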


Use of the following sequence of steps is recommended when testing hypotheses 
about a parameter. The plausibility of any assumptions underlying use of the selected 
test procedure should of course be checked before carrying out the test. 


1. Identify the parameter of interest and describe it in the context of the problem
situation.

2. Determine the null value and state the null hypothesis.

3. State the appropriate alternative hypothesis.

4. Give the formula for the computed value of the test statistic (substituting the
null value and the known values of any other parameters, but not those of any
sample-based quantities).

5. Compute any necessary sample quantities, substitute into the formula for the
test statistic value, and compute that value.

6. Determine the P-value.

7. Compare the selected or specified significance level to the P-value to decide
whether H₀ should be rejected, and state this conclusion in the problem context.


The formulation of hypotheses (Steps 2 and 3) should be done before examining the
data, and the significance level α should be chosen prior to determination of the P-value.


EXAMPLE 8.6 A manufacturer of sprinkler systems used for fire protection in office buildings claims
that the true average system-activation temperature is 130°. A sample of n = 9 systems,
when tested, yields a sample average activation temperature of 131.08°F. If the
distribution of activation temperatures is normal with standard deviation 1.5°F, does the data
contradict the manufacturer’s claim at significance level α = .01?


1. Parameter of interest: μ = true average activation temperature.

2. Null hypothesis: H₀: μ = 130 (null value = μ₀ = 130).

3. Alternative hypothesis: Hₐ: μ ≠ 130 (a departure from the claimed value in
either direction is of concern).

4. Test statistic value:

\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x} - 130}{1.5/\sqrt{n}} \]

5. Substituting n = 9 and x̄ = 131.08,

\[ z = \frac{131.08 - 130}{1.5/\sqrt{9}} = \frac{1.08}{.5} = 2.16 \]

That is, the observed sample mean is a bit more than 2 standard deviations
above what would have been expected were H₀ true.

6. The inequality in Hₐ implies that the test is two-tailed, so the P-value results
from doubling the captured tail area:

P-value = 2[1 − Φ(2.16)] = 2(.0154) = .0308

7. Because P-value = .0308 > .01 = α, H₀ cannot be rejected at significance level
.01. The data does not give strong support to the claim that the true average differs
from the design value of 130.


β and Sample Size Determination The z tests with known σ are among the few
in statistics for which there are simple formulas available for β, the probability of a
type II error. Consider first the alternative Hₐ: μ > μ₀. The null hypothesis is rejected
if P-value ≤ α, and the P-value is the area under the standard normal curve to the
right of z. Suppose that α = .05. The z critical value that captures an upper-tail area
of .05 is z_.05 = 1.645 (look for a cumulative area of .95 in Table A.3). Thus if the
calculated test statistic value z is smaller than 1.645, the area to the right of z will
be larger than .05 and the null hypothesis will then not be rejected. Now substitute
(x̄ − μ₀)/(σ/√n) in place of z in the inequality z < 1.645 and manipulate to isolate
x̄ on the left (multiply both sides by σ/√n and then add μ₀ to both sides). This gives
the equivalent inequality x̄ < μ₀ + z_α · σ/√n. Now let μ′ denote a particular value
of μ that exceeds the null value μ₀. Then,

\[
\begin{aligned}
\beta(\mu') &= P(H_0 \text{ is not rejected when } \mu = \mu') \\
&= P(\bar{X} < \mu_0 + z_\alpha \cdot \sigma/\sqrt{n} \text{ when } \mu = \mu') \\
&= P\left( \frac{\bar{X} - \mu'}{\sigma/\sqrt{n}} < z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}} \text{ when } \mu = \mu' \right) \\
&= \Phi\left( z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}} \right)
\end{aligned}
\]

As μ′ increases, μ₀ − μ′ becomes more negative, so β(μ′) will be small when μ′
greatly exceeds μ₀ (because the value at which Φ is evaluated will then be quite
negative). Error probabilities for the lower-tailed and two-tailed tests are derived in
an analogous manner.

If o is large, the probability of a type II error can be large at an alternative 
value yw’ that is of particular concern to an investigator. Suppose we fix a and 
also specify B for such an alternative value. In the sprinkler example, company 
officials might view ww’ = 132 as a very substantial departure from H,: w = 130 
and therefore wish 6(132)=.10 in addition to a=.01. More generally, 
consider the two restrictions P(type I error) = a and B(w’) = B for specified a, pw’, 
and B. Then for an upper-tailed test, the sample size n should be chosen to satisfy 


afc, ¥ | ms 


This implies that 


z critical value that 4. My — 

=a ‘ Sg erry —a 
® captures lower-tail area B . a/Vn 

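This equation is easily solved for the desired $n$. A parallel argument yields the necessary
sample size for lower- and two-tailed tests as summarized in the next box.

Alternative Hypothesis      Type II Error Probability $\beta(\mu')$ for a Level $\alpha$ Test

$H_a$: $\mu > \mu_0$        $\Phi\left(z_\alpha + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$

$H_a$: $\mu < \mu_0$        $1 - \Phi\left(-z_\alpha + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$

$H_a$: $\mu \ne \mu_0$      $\Phi\left(z_{\alpha/2} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) - \Phi\left(-z_{\alpha/2} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$

where $\Phi(z)$ = the standard normal cdf.
The sample size $n$ for which a level $\alpha$ test also has $\beta(\mu') = \beta$ at the
alternative value $\mu'$ is

$$
n = \begin{cases}
\left[\dfrac{\sigma(z_\alpha + z_\beta)}{\mu_0 - \mu'}\right]^2 & \text{for a one-tailed (upper or lower) test} \\[2ex]
\left[\dfrac{\sigma(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu'}\right]^2 & \text{for a two-tailed test (an approximate solution)}
\end{cases}
$$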


EXAMPLE 8.7 Let $\mu$ denote the true average tread life of a certain type of tire. Consider testing
$H_0$: $\mu = 30{,}000$ versus $H_a$: $\mu > 30{,}000$ based on a sample of size $n = 16$ from
a normal population distribution with $\sigma = 1500$. A test with $\alpha = .01$ requires
$z_\alpha = z_{.01} = 2.33$. The probability of making a type II error when $\mu = 31{,}000$ is

$$
\beta(31{,}000) = \Phi\left(2.33 + \frac{30{,}000 - 31{,}000}{1500/\sqrt{16}}\right) = \Phi(-.34) = .3669
$$

Since $z_{.1} = 1.28$, the requirement that the level .01 test also have $\beta(31{,}000) = .1$
necessitates

$$
n = \left[\frac{1500(2.33 + 1.28)}{30{,}000 - 31{,}000}\right]^2 = (-5.42)^2 = 29.32
$$

The sample size must be an integer, so $n = 30$ tires should be used. ■
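The type II error and sample size formulas for the z tests can be checked numerically. The following is a minimal Python sketch, assuming only the standard library's `statistics.NormalDist` for $\Phi$ and its inverse. The function names and the labels "greater", "less", and "two-sided" are illustrative choices, not notation from the text. Because it uses unrounded critical values, its results can differ slightly from hand calculations that round $z$ to two decimals.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution: mean 0, sd 1


def z_crit(tail_area):
    """z critical value that captures the given upper-tail area."""
    return STD_NORMAL.inv_cdf(1 - tail_area)


def beta_z_test(mu0, mu_alt, sigma, n, alpha, alternative):
    """Type II error probability beta(mu') for a level-alpha z test.

    alternative is "greater", "less", or "two-sided" (illustrative labels).
    """
    shift = (mu0 - mu_alt) / (sigma / math.sqrt(n))
    phi = STD_NORMAL.cdf
    if alternative == "greater":
        return phi(z_crit(alpha) + shift)
    if alternative == "less":
        return 1 - phi(-z_crit(alpha) + shift)
    if alternative == "two-sided":
        z = z_crit(alpha / 2)
        return phi(z + shift) - phi(-z + shift)
    raise ValueError("unknown alternative")


def sample_size_z_test(mu0, mu_alt, sigma, alpha, beta, two_tailed=False):
    """Sample size from the boxed formula, rounded up to an integer.

    For a two-tailed test the formula is only an approximate solution.
    """
    z_a = z_crit(alpha / 2 if two_tailed else alpha)
    n_exact = (sigma * (z_a + z_crit(beta)) / (mu0 - mu_alt)) ** 2
    return math.ceil(n_exact)


# Example 8.7: H0: mu = 30,000 vs Ha: mu > 30,000, sigma = 1500, n = 16, alpha = .01
beta_31000 = beta_z_test(30_000, 31_000, 1500, 16, 0.01, "greater")
n_needed = sample_size_z_test(30_000, 31_000, 1500, alpha=0.01, beta=0.10)
print(round(beta_31000, 3), n_needed)
```

For the tire example this should give a value of $\beta(31{,}000)$ close to the hand-computed .3669 and a required sample size of 30.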


Large-Sample Tests 


When the sample size is large, the foregoing z tests are easily modified to yield valid
test procedures without requiring either a normal population distribution or known
$\sigma$. The key result was used in Chapter 7 to justify large-sample confidence intervals:
A large $n$ implies that the standardized variable

$$
Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}
$$

has approximately a standard normal distribution. Substitution of the null value $\mu_0$
in place of $\mu$ yields the test statistic

$$
Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}
$$

which has approximately a standard normal distribution when $H_0$ is true. The
P-value is then determined exactly as was previously described in this section (e.g.,
$\Phi(z)$ when the alternative hypothesis is $H_a$: $\mu < \mu_0$). Rejecting $H_0$ when P-value $\le \alpha$
gives a test with approximate significance level $\alpha$. The rule of thumb $n > 40$ will
again be used to characterize a large sample size.


EXAMPLE 8.8 A dynamic cone penetrometer (DCP) is used for measuring material resistance to 
penetration (mm/blow) as a cone is driven into pavement or subgrade. Suppose 
that for a particular application it is required that the true average DCP value for 
a certain type of pavement be less than 30. The pavement will not be used unless 
there is conclusive evidence that the specification has been met. Let’s state and test 
the appropriate hypotheses using the following data (“Probabilistic Model for the
Analysis of Dynamic Cone Penetrometer Test Values in Pavement Structure
Evaluation,” J. of Testing and Evaluation, 1999: 7-14):

14.1 14.5 15.5 16.0 16.0 16.7 16.9 17.1 17.5 17.8
17.8 18.1 18.2 18.3 18.3 19.0 19.2 19.4 20.0 20.0
20.8 20.8 21.0 21.5 23.5 27.5 27.5 28.0 28.3 30.0
30.0 31.6 31.7 31.7 32.5 33.5 33.9 35.0 35.0 35.0
36.7 40.0 40.0 41.3 41.7 47.5 50.0 51.0 51.8 54.4
55.0 57.0


Figure 8.5 shows a descriptive summary obtained from Minitab. The sample mean
DCP is less than 30. However, there is a substantial amount of variation in the data
(sample coefficient of variation $= s/\bar{x} = .4265$), so the fact that the mean is less
than the design specification cutoff may be a consequence just of sampling variability.
Notice that the histogram does not at all resemble a normal curve (and a normal
probability plot does not exhibit a linear pattern). However, the large-sample z tests
do not require a normal population distribution.


Descriptive Statistics

Variable: DCP

Anderson-Darling Normality Test
A-Squared:     1.902
P-Value:       0.000

Mean           28.7615
StDev          12.2647
Variance       150.423
Skewness       0.808264
Kurtosis       -3.9E-01
N              52

Minimum        14.1000
1st Quartile   18.2250
Median         27.5000
3rd Quartile   35.0000
Maximum        57.0000

95% Confidence Interval for Mu
25.3470   32.1761
95% Confidence Interval for Sigma
10.2784   15.2098
95% Confidence Interval for Median
20.0000   31.7000


Figure 8.5 Minitab descriptive summary for the DCP data of Example 8.8

1. $\mu$ = true average DCP value

2. $H_0$: $\mu = 30$

3. $H_a$: $\mu < 30$ (so the pavement will not be used unless the null hypothesis is
rejected)

4. $z = \dfrac{\bar{x} - 30}{s/\sqrt{n}}$

5. With $n = 52$, $\bar{x} = 28.76$, and $s = 12.2647$,

$$
z = \frac{28.76 - 30}{12.2647/\sqrt{52}} = \frac{-1.24}{1.701} = -.73
$$

6. The P-value for this lower-tailed z test is $\Phi(-.73) = .2327$.

7. Since $.2327 > .05$, $H_0$ cannot be rejected. We do not have compelling evidence
for concluding that $\mu < 30$; use of the pavement is not justified. Note that in
not rejecting $H_0$, we might possibly have committed a type II error. ■
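The seven steps of the large-sample z test can be reproduced from the summary quantities alone. The sketch below is illustrative rather than definitive: the function name `large_sample_z_test` and the alternative labels are assumptions introduced here, and $\Phi$ comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def large_sample_z_test(xbar, s, n, mu0, alternative):
    """Return (z, P-value) for the large-sample z test of H0: mu = mu0.

    alternative is "greater", "less", or "two-sided" (illustrative labels).
    """
    z = (xbar - mu0) / (s / math.sqrt(n))
    phi = NormalDist().cdf          # standard normal cdf
    if alternative == "greater":
        p = 1 - phi(z)              # upper-tail area
    elif alternative == "less":
        p = phi(z)                  # lower-tail area
    elif alternative == "two-sided":
        p = 2 * (1 - phi(abs(z)))   # both tails
    else:
        raise ValueError("unknown alternative")
    return z, p


# Example 8.8: n = 52, xbar = 28.76, s = 12.2647, H0: mu = 30 versus Ha: mu < 30
z, p_value = large_sample_z_test(28.76, 12.2647, 52, 30, "less")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The P-value computed this way uses the unrounded $z$, so it may differ from the table-based .2327 in the third or fourth decimal place; the conclusion at level .05 is the same.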


Determination of $\beta$ and the necessary sample size for these large-sample tests
can be based either on specifying a plausible value of $\sigma$ and using the previous
formulas (even though $s$ is used in the test) or on using the methodology to be introduced
in connection with the one-sample t tests discussed in Section 8.3.


EXERCISES Section 8.2 (15-28) 


15. Let $\mu$ denote the true average reaction time to a certain
stimulus. For a z test of $H_0$: $\mu = 5$ versus $H_a$: $\mu > 5$,
determine the P-value for each of the following values of
the z test statistic.

a. 1.42 b. .90 c. 1.96 d. 2.48 e. −.11


16. Newly purchased tires of a particular type are supposed
to be filled to a pressure of 30 psi. Let $\mu$ denote the true
average pressure. A test is to be carried out to decide
whether $\mu$ differs from the target value. Determine the
P-value for each of the following z test statistic values.

a. 2.10 b. −1.75 c. −.55 d. 1.41 e. −5.3


17. Answer the following questions for the tire problem in
Example 8.7.

a. If $\bar{x} = 30{,}960$ and a level $\alpha = .01$ test is used, what
is the decision?

b. If a level .01 test is used, what is $\beta(30{,}500)$?

c. If a level .01 test is used and it is also required that
$\beta(30{,}500) = .05$, what sample size $n$ is necessary?

d. If $\bar{x} = 30{,}960$, what is the smallest $\alpha$ at which $H_0$ can
be rejected (based on $n = 16$)?


18. Reconsider the paint-drying situation of Example 8.5, in
which drying time for a test specimen is normally distributed
with $\sigma = 9$. The hypotheses $H_0$: $\mu = 75$ versus
$H_a$: $\mu < 75$ are to be tested using a random sample of
$n = 25$ observations.

a. How many standard deviations (of $\bar{X}$) below the null
value is $\bar{x} = 72.3$?

b. If $\bar{x} = 72.3$, what is the conclusion using $\alpha = .002$?

c. For the test procedure with $\alpha = .002$, what is $\beta(70)$?

d. If the test procedure with $\alpha = .002$ is used, what $n$ is
necessary to ensure that $\beta(70) = .01$?

e. If a level .01 test is used with $n = 100$, what is the
probability of a type I error when $\mu = 76$?


19. The melting point of each of 16 samples of a certain
brand of hydrogenated vegetable oil was determined,
resulting in $\bar{x} = 94.32$. Assume that the distribution of
the melting point is normal with $\sigma = 1.20$.

a. Test $H_0$: $\mu = 95$ versus $H_a$: $\mu \ne 95$ using a two-tailed
level .01 test.

b. If a level .01 test is used, what is $\beta(94)$, the probability
of a type II error when $\mu = 94$?

c. What value of $n$ is necessary to ensure that $\beta(94) = .1$
when $\alpha = .01$?


20. Lightbulbs of a certain type are advertised as having an
average lifetime of 750 hours. The price of these bulbs is
very favorable, so a potential customer has decided to go
ahead with a purchase arrangement unless it can be conclusively
demonstrated that the true average lifetime is
smaller than what is advertised. A random sample of 50
bulbs was selected, the lifetime of each bulb determined,
and the appropriate hypotheses were tested using
Minitab, resulting in the accompanying output.

Variable    N    Mean     StDev   SE Mean   Z       P-Value
lifetime    50   738.44   38.20   5.40      -2.14   0.016

What conclusion would be appropriate for a significance
level of .05? A significance level of .01? What significance
level and conclusion would you recommend?


21. The desired percentage of SiO₂ in a certain type of aluminous
cement is 5.5. To test whether the true average
percentage is 5.5 for a particular production facility, 16
independently obtained samples are analyzed. Suppose
that the percentage of SiO₂ in a sample is normally distributed
with $\sigma = .3$ and that $\bar{x} = 5.25$.

a. Does this indicate conclusively that the true average
percentage differs from 5.5?

b. If the true average percentage is $\mu = 5.6$ and a level
$\alpha = .01$ test based on $n = 16$ is used, what is the
probability of detecting this departure from $H_0$?

c. What value of $n$ is required to satisfy $\alpha = .01$ and
$\beta(5.6) = .01$?


22. To obtain information on the corrosion-resistance properties
of a certain type of steel conduit, 45 specimens are
buried in soil for a 2-year period. The maximum penetration
(in mils) for each specimen is then measured, yielding
a sample average penetration of $\bar{x} = 52.7$ and a sample
standard deviation of $s = 4.8$. The conduits were manufactured
with the specification that true average penetration be
at most 50 mils. They will be used unless it can be demonstrated
conclusively that the specification has not been met.
What would you conclude?


23. Automatic identification of the boundaries of significant
structures within a medical image is an area of ongoing
research. The paper “Automatic Segmentation of Medical
Images Using Image Registration: Diagnostic and
Simulation Applications” (J. of Medical Engr. and
Tech., 2005: 53-63) discussed a new technique for such
identification. A measure of the accuracy of the automatic
region is the average linear displacement (ALD). The paper
gave the following ALD observations for a sample of 49
kidneys (units of pixel dimensions).

1.38 0.44 1.09 0.75 0.66 1.28 0.51
0.39 0.70 0.46 0.54 0.83 0.58 0.64
1.30 0.57 0.43 0.62 1.00 1.05 0.82
1.10 0.65 0.99 0.56 0.56 0.64 0.45
0.82 1.06 0.41 0.58 0.66 0.54 0.83
0.59 0.51 1.04 0.85 0.45 0.52 0.58
1.11 0.34 1.25 0.38 1.44 1.28 0.51

a. Summarize/describe the data.

b. Is it plausible that ALD is at least approximately
normally distributed? Must normality be assumed
prior to calculating a CI for true average ALD or testing
hypotheses about true average ALD? Explain.

c. The authors commented that in most cases the
ALD is better than or of the order of 1.0. Does the
data in fact provide strong evidence for concluding
that true average ALD under these circumstances
is less than 1.0? Carry out an appropriate test of
hypotheses.

d. Calculate an upper confidence bound for true average
ALD using a confidence level of 95%, and interpret
this bound.


24. Unlike most packaged food products, alcohol beverage
container labels are not required to show calorie or nutrient
content. The article “What Am I Drinking? The
Effects of Serving Facts Information on Alcohol
Beverage Containers” (J. of Consumer Affairs, 2008:
81-99) reported on a pilot study in which each of 58
individuals in a sample was asked to estimate the calorie
content of a 12-oz can of beer known to contain 153
calories. The resulting sample mean estimated calorie
level was 191 and the sample standard deviation was 89.
Does this data suggest that the true average estimated
calorie content in the population sampled exceeds the
actual content? Test the appropriate hypotheses at significance
level .001.


25. Body armor provides critical protection for law
enforcement personnel, but it does affect balance and
mobility. The article “Impact of Police Body Armour
and Equipment on Mobility” (Applied Ergonomics,
2013: 957-961) reported that for a sample of 52 male
enforcement officers who underwent an acceleration
task that simulated exiting a vehicle while wearing
armor, the sample mean was 1.95 sec, and the sample
standard deviation was .20 sec. Does it appear that
true average task time is less than 2 sec? Carry out a
test of appropriate hypotheses using a significance
level of .01.


26. The recommended daily dietary allowance for zinc
among males older than age 50 years is 15 mg/day. The
article “Nutrient Intakes and Dietary Patterns of
Older Americans: A National Study” (J. of
Gerontology, 1992: M145-150) reports the following
summary data on intake for a sample of males age 65-74
years: $n = 115$, $\bar{x} = 11.3$, and $s = 6.43$. Does this data
indicate that average daily zinc intake in the population
of all males ages 65-74 falls below the recommended
allowance?


27. Show that for any $\Delta > 0$, when the population distribution
is normal and $\sigma$ is known, the two-tailed test satisfies
$\beta(\mu_0 - \Delta) = \beta(\mu_0 + \Delta)$, so that $\beta(\mu')$ is symmetric
about $\mu_0$.


28. For a fixed alternative value $\mu'$, show that $\beta(\mu') \to 0$
as $n \to \infty$ for either a one-tailed or a two-tailed z test
in the case of a normal population distribution with
known $\sigma$.


8.3 The One-Sample t Test


When $n$ is small, the Central Limit Theorem (CLT) can no longer be invoked to
justify the use of a large-sample test. We faced this same difficulty in obtaining a
small-sample confidence interval (CI) for $\mu$ in Chapter 7. Our approach here will
be the same one used there: We will assume that the population distribution is at
least approximately normal and describe test procedures whose validity rests on this
assumption. If an investigator has good reason to believe that the population distribution
is quite nonnormal, a distribution-free test from Chapter 15 may be appropriate.
Alternatively, a statistician can be consulted regarding procedures valid for specific
families of population distributions other than the normal family. Or a bootstrap
procedure can be developed.

The key result on which tests for a normal population mean are based was used
in Chapter 7 to derive the one-sample t CI: If $X_1, X_2, \ldots, X_n$ is a random sample from
a normal distribution, the standardized variable

$$
T = \frac{\bar{X} - \mu}{S/\sqrt{n}}
$$

has a t distribution with $n - 1$ degrees of freedom (df). Consider testing
$H_0$: $\mu = \mu_0$ using the test statistic $T = (\bar{X} - \mu_0)/(S/\sqrt{n})$. That is, the test statistic
results from standardizing $\bar{X}$ under the assumption that $H_0$ is true (using $S/\sqrt{n}$,
the estimated standard deviation of $\bar{X}$, rather than $\sigma/\sqrt{n}$). When $H_0$ is true, this
test statistic has a t distribution with $n - 1$ df. Knowledge of the test statistic’s
distribution when $H_0$ is true (the “null distribution”) allows us to determine the
P-value.

The test statistic is really the same here as in the large-sample case but is labeled
$T$ to emphasize that the reference distribution for P-value determination is a t
distribution with $n - 1$ df rather than the standard normal (z) distribution. Instead of
being a z curve area as was the case for large-sample tests, the P-value will now be
an area under the $t_{n-1}$ curve (see Figure 8.6).


The One-Sample t Test

Null hypothesis: $H_0$: $\mu = \mu_0$

Test statistic value: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Alternative Hypothesis      P-Value Determination
$H_a$: $\mu > \mu_0$        Area under the $t_{n-1}$ curve to the right of $t$
$H_a$: $\mu < \mu_0$        Area under the $t_{n-1}$ curve to the left of $t$
$H_a$: $\mu \ne \mu_0$      2 · (Area under the $t_{n-1}$ curve to the right of $|t|$)

Assumption: The data consists of a random sample from a normal population
distribution.


[Figure: three panels, each showing the t curve for the relevant df.
1. Upper-tailed test ($H_a$ contains the inequality >): P-value = area in the upper tail, to the right of the calculated $t$.
2. Lower-tailed test ($H_a$ contains the inequality <): P-value = area in the lower tail, to the left of the calculated $t$.
3. Two-tailed test ($H_a$ contains the inequality ≠): P-value = sum of the areas in the two tails, beyond the calculated $t$ and $-t$.]

Figure 8.6 P-values for t tests


Unfortunately the table of t critical values that we used for confidence and prediction
interval calculations in Chapter 7 does not provide much information about t curve
tail areas. This is because for each t distribution there are values for only the seven
most commonly used tail areas: .10, .05, .025, .01, .005, .001, and .0005. P-value
determination would be straightforward if we had a table of tail areas (or alternatively,
cumulative areas) that resembled our z table: for each different t distribution,
the area under the corresponding curve to the right (or the left) of values 0.00, 0.01,
0.02, 0.03, ... , 3.97, 3.98, 3.99, and finally 4.00. But this would necessitate an entire
page of text for each different t distribution.

So we have included another t table in Appendix Table A.8. It contains a
tabulation of upper-tail t curve areas but with less decimal accuracy than what the
z table provides. Each different column of the table is for a different number of df,
and the rows are for calculated values of the test statistic $t$ ranging from 0.0 to 4.0
in increments of .1. For example, the number .074 appears at the intersection of the
1.6 row and the 8 df column. Thus the area under the 8 df curve to the right of 1.6
(an upper-tail area) is .074. Because t curves are symmetric about 0, .074 is also the
area under the 8 df curve to the left of −1.6.

Suppose, for example, that a test of $H_0$: $\mu = 100$ versus $H_a$: $\mu > 100$ is based
on the 8 df t distribution. If the calculated value of the test statistic is $t = 1.6$, then the
P-value for this upper-tailed test is .074. Because .074 exceeds .05, we would not be
able to reject $H_0$ at a significance level of .05. If the alternative hypothesis is $H_a$: $\mu < 100$
and a test based on 20 df yields $t = -3.2$, then Appendix Table A.8 shows that the
P-value is the captured lower-tail area .002. The null hypothesis can be rejected at
either level .05 or .01. In the next chapter, we will present a t test for hypotheses about a
difference between two population means. Suppose the relevant hypotheses are
$H_0$: $\mu_1 - \mu_2 = 0$ versus $H_a$: $\mu_1 - \mu_2 \ne 0$; the null hypothesis states that the means of the two
populations are identical, whereas the alternative hypothesis states that they are different
without specifying a direction of departure from $H_0$. If a t test is based on 20 df and
$t = 3.2$, then the P-value for this two-tailed test is 2(.002) = .004. This would also be
the P-value for $t = -3.2$. The tail area is doubled because values both larger than 3.2
and smaller than −3.2 are more contradictory to $H_0$ than what was calculated (values
farther out in either tail of the t curve).
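When a table of t curve tail areas is not at hand, the areas can be approximated numerically. The sketch below is one possible approach, not the method used to build the appendix table: it integrates the t density with Simpson's rule using only the standard library. The function names and alternative labels are illustrative, and the use of `math.gamma` assumes a moderate number of df (very large df would overflow).

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Area under the t curve with df degrees of freedom to the right of t.

    Integrates the density from 0 to |t| with Simpson's rule (steps must be
    even) and then uses the symmetry of the t curve about 0.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    upper = 0.5 - area_0_to_a           # area to the right of |t|
    return upper if t >= 0 else 1 - upper


def t_test_p_value(t, df, alternative):
    """P-value for "greater", "less", or "two-sided" (illustrative labels)."""
    if alternative == "greater":
        return t_upper_tail(t, df)
    if alternative == "less":
        return t_upper_tail(-t, df)     # lower-tail area, by symmetry
    if alternative == "two-sided":
        return 2 * t_upper_tail(abs(t), df)
    raise ValueError("unknown alternative")


print(round(t_test_p_value(1.6, 8, "greater"), 3))      # upper-tailed, 8 df
print(round(t_test_p_value(-3.2, 20, "less"), 3))       # lower-tailed, 20 df
print(round(t_test_p_value(3.2, 20, "two-sided"), 3))   # two-tailed, 20 df
```

The three calls correspond to the three table lookups discussed above, so the results should agree with .074, .002, and .004 up to the rounding of the table.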


EXAMPLE 8.9 Carbon nanofibers have potential application as heat-management materials, for 
composite reinforcement, and as components for nanoelectronics and photonics. 
The accompanying data on failure stress (MPa) of fiber specimens was read from a 
graph in the article “Mechanical and Structural Characterization of Electrospun 
PAN-Derived Carbon Nanofibers” (Carbon, 2005: 2175-2185). 


300 312 327 368 400 425 470 556 573 575
580 589 626 637 690 715 757 891 900

Summary quantities include $n = 19$, $\bar{x} = 562.68$, $s = 180.874$, $s/\sqrt{n} = 41.495$.
Does the data provide compelling evidence for concluding that true average failure
stress exceeds 500 MPa?

Figure 8.7 shows a normal probability plot of the data; the substantial linear
pattern indicates that a normal population distribution of failure stress is quite plausible,
giving us license to employ the one-sample t test (the box to the right of the plot
gives information about a formal test of the hypothesis that the population distribution
is normal; this will be discussed in Chapter 14).


[Figure: normal probability plot; horizontal axis: failure stress (100 to 1000), vertical axis: percent.
Accompanying box: Mean 562.7, StDev 180.9, N 19, RJ 0.982, P-Value >0.100]

Figure 8.7 Normal probability plot of the failure stress data


Let’s carry out a test of the relevant hypotheses using a significance level of .05. 


1. The parameter of interest is $\mu$ = the true average failure stress.

2. The null hypothesis is $H_0$: $\mu = 500$.

3. The appropriate alternative hypothesis is $H_a$: $\mu > 500$ (so we’ll believe that true
average failure stress exceeds 500 only if the null hypothesis can be rejected).

4. The one-sample t test statistic is $T = (\bar{X} - 500)/(S/\sqrt{n})$. Its value $t$ for the
given data results from replacing $\bar{X}$ by $\bar{x}$ and $S$ by $s$.

5. The test-statistic value is $t = (562.68 - 500)/41.495 = 1.51$.

6. The test is based on $19 - 1 = 18$ df. The entry in that column and the 1.5
row of Appendix Table A.8 is .075. Since the test is upper-tailed (because >
appears in $H_a$), it follows that P-value ≈ .075 (Minitab says .074).

7. Because $.075 > .05$, there is not enough evidence to justify rejecting the null
hypothesis at significance level .05. Rather than conclude that the true average
failure stress exceeds 500, it appears that sampling variability provides a
plausible explanation for the fact that the sample mean exceeds 500 by a rather
substantial amount. ■


EXAMPLE 8.10 Many deleterious effects of smoking on health have been well documented. The 
article “Smoking Abstinence Impairs Time Estimation Accuracy in Cigarette 
Smokers” (Psychopharmacology Bull., 2003: 90-95) described an investigation 
into whether time perception, an indicator of a person’s ability to concentrate, is 
impaired during nicotine withdrawal. After a 24-hour smoking abstinence, each of 
20 smokers was asked to estimate how much time had elapsed during a 45-second 
period. The following data on perceived elapsed time is consistent with summary 
quantities given in the cited article. 


69 65 72 73 59 55 39 52 67 57
56 50 70 47 56 45 70 64 67 53


A normal probability plot of this data shows a very substantial linear pattern. Let’s 
carry out a test of hypotheses at significance level .05 to decide whether true average 
perceived elapsed time differs from the known time 45. 


1. $\mu$ = true average perceived elapsed time for all smokers exposed to the
described experimental regimen

2. $H_0$: $\mu = 45$

3. $H_a$: $\mu \ne 45$

4. $t = (\bar{x} - 45)/(s/\sqrt{n})$

5. With $\bar{x} = 59.30$ and $s/\sqrt{n} = 9.84/\sqrt{20} = 2.200$, the test statistic value is
$t = 14.3/2.200 = 6.50$.

6. The P-value for a two-tailed test is twice the area under the 19 df t curve to the
right of 6.50. Since Table A.8 shows that the area under this t curve to the right
of 4.0 is 0, the area to the right of 6.50 is certainly 0. The P-value is then 2(0) =
0 (.00000 according to software).

7. A P-value as small as what we obtained argues very strongly for rejection of
$H_0$ at any reasonable significance level, and in particular at significance level
.05. The difference between the sample mean and its expected value when $H_0$ is
true cannot plausibly be explained simply by chance variation. The true average
perceived elapsed time is evidently something other than 45, so nicotine withdrawal
does appear to impair perception of time. ■
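The summary quantities and test statistic values in Examples 8.9 and 8.10 can be recomputed directly from the listed observations. The following is a small sketch using the standard library's `statistics` module; the function name `one_sample_t_statistic` is an illustrative choice. It stops at the value of $t$, leaving the P-value to a t table or software.

```python
import math
from statistics import mean, stdev


def one_sample_t_statistic(data, mu0):
    """Return (n, xbar, s, standard error, t) for testing H0: mu = mu0."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)                  # sample standard deviation (divisor n - 1)
    se = s / math.sqrt(n)            # estimated standard deviation of the sample mean
    return n, xbar, s, se, (xbar - mu0) / se


# Failure stress data (MPa) from Example 8.9, null value 500
failure_stress = [300, 312, 327, 368, 400, 425, 470, 556, 573, 575,
                  580, 589, 626, 637, 690, 715, 757, 891, 900]
# Perceived elapsed time data from Example 8.10, null value 45
elapsed_time = [69, 65, 72, 73, 59, 55, 39, 52, 67, 57,
                56, 50, 70, 47, 56, 45, 70, 64, 67, 53]

for label, data, mu0 in [("8.9", failure_stress, 500), ("8.10", elapsed_time, 45)]:
    n, xbar, s, se, t = one_sample_t_statistic(data, mu0)
    print(f"Example {label}: n={n}, xbar={xbar:.2f}, s={s:.3f}, "
          f"s/sqrt(n)={se:.3f}, t={t:.2f}, df={n - 1}")
```

The printed values should match the summary quantities quoted in the two examples, with $t$ near 1.51 on 18 df and near 6.50 on 19 df.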


β and Sample Size Determination


The calculation of $\beta$ at the alternative value $\mu'$ for a normal population distribution
with known $\sigma$ was carried out by converting the inequality P-value $> \alpha$ to a statement
about $\bar{x}$ (e.g., $\bar{x} < \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$) and then subtracting $\mu'$ to standardize
correctly. An equivalent approach involves noting that when $\mu = \mu'$, the test statistic
$Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ still has a normal distribution with variance 1, but now
the mean value of $Z$ is given by $(\mu' - \mu_0)/(\sigma/\sqrt{n})$. That is, when $\mu = \mu'$, the test
statistic still has a normal distribution though not the standard normal distribution.
Because of this, $\beta(\mu')$ is an area under the normal curve corresponding to mean
value $(\mu' - \mu_0)/(\sigma/\sqrt{n})$ and variance 1. Both $\alpha$ and $\beta$ involve working with normally
distributed variables.

The calculation of $\beta(\mu')$ for the t test is much less straightforward. This is
because the distribution of the test statistic $T = (\bar{X} - \mu_0)/(S/\sqrt{n})$ is quite complicated
when $H_0$ is false and $H_a$ is true. Thus, for an upper-tailed test, determining

$$
\beta(\mu') = P(T < t_{\alpha, n-1} \text{ when } \mu = \mu' \text{ rather than } \mu_0)
$$

involves integrating a very unpleasant density function. This must be done numerically.
The results are summarized in graphs of $\beta$ that appear in Appendix Table A.17.
There are four sets of graphs, corresponding to one-tailed tests at level .05 and level
.01 and two-tailed tests at the same levels.

To understand how these graphs are used, note first that both $\beta$ and the
necessary sample size $n$ are as before functions not just of the absolute difference
$|\mu_0 - \mu'|$ but of $d = |\mu_0 - \mu'|/\sigma$. Suppose, for example, that $|\mu_0 - \mu'| = 10$.
This departure from $H_0$ will be much easier to detect (smaller $\beta$) when $\sigma = 2$,
in which case $\mu_0$ and $\mu'$ are 5 population standard deviations apart, than when
$\sigma = 10$. The fact that $\beta$ for the t test depends on $d$ rather than just $|\mu_0 - \mu'|$ is
unfortunate, since to use the graphs one must have some idea of the true value
of $\sigma$. A conservative (large) guess for $\sigma$ will yield a conservative (large) value
of $\beta(\mu')$ and a conservative estimate of the sample size necessary for prescribed
$\alpha$ and $\beta(\mu')$.

Once the alternative $\mu'$ and value of $\sigma$ are selected, $d$ is calculated and its
value located on the horizontal axis of the relevant set of curves. The value of $\beta$
is the height of the $n - 1$ df curve above the value of $d$ (visual interpolation is
necessary if $n - 1$ is not a value for which the corresponding curve appears), as
illustrated in Figure 8.8.


[Figure: the $\beta$ curve for $n - 1$ df; the height of the curve above the value of $d$ corresponding to the specified alternative $\mu'$ is $\beta$ when $\mu = \mu'$.]

Figure 8.8 A typical β curve for the t test


Rather than fixing $n$ (i.e., $n - 1$, and thus the particular curve from which $\beta$ is
read), one might prescribe both $\alpha$ (.05 or .01 here) and a value of $\beta$ for the chosen
$\mu'$ and $\sigma$. After computing $d$, the point $(d, \beta)$ is located on the relevant set of graphs.
The curve below and closest to this point gives $n - 1$ and thus $n$ (again, interpolation
is often necessary).


EXAMPLE 8.11 The true average voltage drop from collector to emitter of insulated gate bipolar
transistors of a certain type is supposed to be at most 2.5 volts. An investigator
selects a sample of $n = 10$ such transistors and uses the resulting voltages as a basis
for testing $H_0$: $\mu = 2.5$ versus $H_a$: $\mu > 2.5$ using a t test with significance level
$\alpha = .05$. If the standard deviation of the voltage distribution is $\sigma = .100$, how likely is
it that $H_0$ will not be rejected when in fact $\mu = 2.6$? With $d = |2.5 - 2.6|/.100 = 1.0$,
the point on the $\beta$ curve at 9 df for a one-tailed test with $\alpha = .05$ above 1.0 has a
height of approximately .1, so $\beta \approx .1$. The investigator might think that this is too
large a value of $\beta$ for such a substantial departure from $H_0$ and may wish to have
$\beta = .05$ for this alternative value of $\mu$. Since $d = 1.0$, the point $(d, \beta) = (1.0, .05)$
must be located. This point is very close to the 14 df curve, so using $n = 15$ will give
both $\alpha = .05$ and $\beta = .05$ when the value of $\mu$ is 2.6 and $\sigma = .10$. A larger value of
$\sigma$ would give a larger $\beta$ for this alternative, and an alternative value of $\mu$ closer to
2.5 would also result in an increased value of $\beta$. ■


Most of the widely used statistical software packages are capable of calculating
type II error probabilities. They generally work in terms of power, which is
simply $1 - \beta$. A small value of $\beta$ (close to 0) is equivalent to large power (near 1).
A powerful test is one that has high power and therefore good ability to detect when
the null hypothesis is false.

As an example, we asked Minitab to determine the power of the upper-tailed
test in Example 8.11 for the three sample sizes 5, 10, and 15 when $\alpha = .05$, $\sigma = .10$,
and the value of $\mu$ is actually 2.6 rather than the null value 2.5, a “difference” of
$2.6 - 2.5 = .1$. We also asked the software to determine the necessary sample size for
a power of .9 ($\beta = .1$) and also .95. Here is the resulting output:


Power and Sample Size

Testing mean = null (versus > null)
Calculating power for mean = null + difference
Alpha = 0.05  Assumed standard deviation = 0.1

            Sample
Difference    Size     Power
       0.1       5  0.579737
       0.1      10  0.897517
       0.1      15  0.978916

            Sample  Target    Actual
Difference    Size   Power     Power
       0.1      11    0.90  0.924489
       0.1      13    0.95  0.959703


The power for the sample size $n = 10$ is a bit smaller than .9. So if we insist that the
power be at least .9, a sample size of 11 is required and the actual power for that $n$ is
roughly .92. The software says that for a target power of .95, a sample size of $n = 13$
is required, whereas eyeballing our $\beta$ curves gave 15. When available, this type of
software is more reliable than the curves. Finally, Minitab now also provides power
curves for the specified sample sizes, as shown in Figure 8.9. Such curves illustrate
how the power increases for each sample size as the actual value of $\mu$ moves farther
and farther away from the null value.


[Figure: “Power Curves for 1-Sample t Test”; one curve for each sample size (5, 10, 15); horizontal axis: difference (0.00 to 0.20), vertical axis: power. Assumptions: Alpha 0.05, StDev 0.1, Alternative >.]

Figure 8.9 Power curves from Minitab for the t test of Example 8.11
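Power values of this kind can also be approximated by simulation. The sketch below is not how Minitab computes power (software typically evaluates a noncentral t distribution); it simply generates many normal samples with $\mu = 2.6$ and $\sigma = .10$ and records how often the upper-tailed level .05 t test rejects $H_0$: $\mu = 2.5$. The function name, the number of replications, and the seed are arbitrary choices made here, and the critical values $t_{.05, n-1}$ are typed in from a standard t table.

```python
import math
import random

# Upper-tail critical values t_{.05, n-1} for n = 5, 10, 15 (from a standard t table).
T_CRIT_05 = {5: 2.132, 10: 1.833, 15: 1.761}


def simulated_power(n, mu0, mu_true, sigma, t_crit, reps=20_000, seed=1):
    """Estimate the power of the upper-tailed one-sample t test by simulation."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(mu_true, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        t = (xbar - mu0) / (s / math.sqrt(n))
        if t >= t_crit:              # equivalent to P-value <= .05
            rejections += 1
    return rejections / reps


for n, t_crit in T_CRIT_05.items():
    power = simulated_power(n, mu0=2.5, mu_true=2.6, sigma=0.10, t_crit=t_crit)
    print(f"n = {n:2d}: estimated power = {power:.3f}, estimated beta = {1 - power:.3f}")
```

With this many replications the estimates should land within a percentage point or so of the software values for the three sample sizes; the exact digits depend on the seed.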


Variation in P-values 


The P-value resulting from carrying out a test on a selected sample is not the
probability that $H_0$ is true, nor is it the probability of rejecting the null hypothesis.
Once again, it is the probability, calculated assuming that $H_0$ is true, of
obtaining a test statistic value at least as contradictory to the null hypothesis as
the value that actually resulted. For example, consider testing $H_0$: $\mu = 50$ against
$H_a$: $\mu < 50$ using a lower-tailed t test based on 20 df. If the calculated value of the
test statistic is $t = -2.00$, then

$$
\begin{aligned}
\text{P-value} &= P(T < -2.00 \text{ when } \mu = 50) \\
&= \text{area under the } t_{20} \text{ curve to the left of } -2.00 = .030
\end{aligned}
$$

But if a second sample is selected, the resulting value of $t$ will almost surely be different
from −2.00, so the corresponding P-value will also likely differ from .030.
Because the test statistic value itself varies from one sample to another, the P-value
will also vary from one sample to another. That is, the test statistic is a random
variable, and so the P-value will also be a random variable. A first sample may give
a P-value of .030, a second sample may result in a P-value of .117, a third may yield
.061 as the P-value, and so on.

If $H_0$ is false, we hope the P-value will be close to 0 so that the null hypothesis
can be rejected. On the other hand, when $H_0$ is true, we’d like the P-value to exceed
the selected significance level so that the correct decision to not reject $H_0$ is made.
The next example presents simulations to show how the P-value behaves both when
the null hypothesis is true and when it is false.


EXAMPLE 8.12 The fuel efficiency (mpg) of any particular new vehicle under specified driving conditions
may not be identical to the EPA figure that appears on the vehicle’s sticker.
Suppose that four different vehicles of a particular type are to be selected and driven
over a certain course, after which the fuel efficiency of each one is to be determined.
Let $\mu$ denote the true average fuel efficiency under these conditions. Consider testing
$H_0$: $\mu = 20$ versus $H_a$: $\mu > 20$ using the one-sample t test based on the resulting
sample. Since the test is based on $n - 1 = 3$ degrees of freedom, the P-value for an
upper-tailed test is the area under the t curve with 3 df to the right of the calculated $t$.

Let’s first suppose that the null hypothesis is true. We asked Minitab to
generate 10,000 different samples, each containing 4 observations, from a normal
population distribution with mean value $\mu = 20$ and standard deviation $\sigma = 2$. The
first sample and resulting summary quantities were


\[
x_1 = 20.830, \quad x_2 = 22.232, \quad x_3 = 20.276, \quad x_4 = 17.718
\]
\[
\bar{x} = 20.264, \quad s = 1.8864, \quad t = \frac{20.264 - 20}{1.8864/\sqrt{4}} = .2799
\]


The P-value is the area under the 3-df $t$ curve to the right of .2799, which according
to Minitab is .3989. Using a significance level of .05, the null hypothesis
would of course not be rejected. The values of $t$ for the next four samples were
$-1.7591$, $.6082$, $-.7020$, and $3.1053$, with corresponding P-values .912, .293, .733,
and .0265.

Figure 8.10(a) shows a histogram of the 10,000 P-values from this simula- 
tion experiment. About 4.5% of these P-values are in the first class interval from 0 
to .05. Thus when using a significance level of .05, the null hypothesis is rejected 
in roughly 4.5% of these 10,000 tests. If we continued to generate samples and 
carry out the test for each sample at significance level .05, in the long run 5% of 
the P-values would be in the first class interval. This is because when $H_0$ is true
and a test with significance level .05 is used, by definition the probability of
rejecting $H_0$ is .05.

Looking at the histogram, it appears that the distribution of P-values is relatively
flat. In fact, it can be shown that when $H_0$ is true, the probability distribution of
the P-value is a uniform distribution on the interval from 0 to 1. That is, the density
curve is completely flat on this interval, and thus must have a height of 1 if the total
area under the curve is to be 1. Since the area under such a curve to the left of .05 is
$(.05)(1) = .05$, we again have that the probability of rejecting $H_0$ when it is true
is .05, the chosen significance level.

Now consider what happens when $H_0$ is false because $\mu = 21$. We again had
Minitab generate 10,000 different samples of size 4 (each from a normal distribution
with $\mu = 21$ and $\sigma = 2$), calculate $t = (\bar{x} - 20)/(s/\sqrt{4})$ for each one, and then
determine the P-value. The first such sample resulted in $\bar{x} = 20.6411$, $s = .49637$,
$t = 2.5832$, P-value $= .0408$. Figure 8.10(b) gives a histogram of the resulting
P-values. The shape of this histogram is quite different from that of Figure 8.10(a):
there is a much greater tendency for the P-value to be small (closer to 0) when
$\mu = 21$ than when $\mu = 20$. Again $H_0$ is rejected at significance level .05 whenever
the P-value is at most .05 (in the first class interval). Unfortunately, this is the case
for only about 19% of the P-values. So only about 19% of the 10,000 tests correctly
reject the null hypothesis; for the other 81%, a type II error is committed. The
difficulty is that the sample size is quite small and 21 is not very different from the value
asserted by the null hypothesis.

Figure 8.10(c) illustrates what happens to the P-value when $H_0$ is false
because $\mu = 22$ (still with $n = 4$ and $\sigma = 2$). The histogram is even more
concentrated toward values close to 0 than was the case when $\mu = 21$. In general, as $\mu$
moves farther to the right of the null value 20, the distribution of the P-value will
become more and more concentrated on values close to 0. Even here a bit fewer
than 50% of the P-values are smaller than .05. So it is still slightly more likely than
not that the null hypothesis is incorrectly not rejected. Only for values of $\mu$ much
larger than 20 (e.g., at least 24 or 25) is it highly likely that the P-value will be
smaller than .05 and thus give the correct conclusion.


Figure 8.10 P-value simulation results for Example 8.12

The big idea of this example is that because the value of any test statistic is 
random, the P-value will also be a random variable and thus have a distribution. 
The farther the actual value of the parameter is from the value specified by the 
null hypothesis, the more the distribution of the P-value will be concentrated on 
values close to 0 and the greater the chance that the test will correctly reject $H_0$
(corresponding to smaller $\beta$). ■
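
The simulation in Example 8.12 was run in Minitab. A rough Python analogue, using only the standard library, might look like the sketch below. The function names (`upper_tail_t3`, `simulate_p_values`, `rejection_rate`) and the seed are illustrative choices rather than part of the example, and the closed-form cdf is specific to 3 df. Because the random numbers differ from Minitab's, the proportions should come out close to, not identical with, the figures of about 4.5%, 19%, and just under 50% reported in the example.

```python
import math
import random
import statistics


def upper_tail_t3(t):
    """Area under the t curve with 3 df to the right of t (closed form, 3 df only)."""
    u = t / math.sqrt(3.0)
    cdf = 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi
    return 1.0 - cdf


def simulate_p_values(true_mean, reps=10_000, n=4, sigma=2.0, null_mean=20.0, seed=1):
    """P-values of the upper-tailed one-sample t test for `reps` simulated normal samples."""
    if n != 4:
        raise ValueError("upper_tail_t3 is valid only for n = 4 (3 df)")
    rng = random.Random(seed)  # assumed seed, chosen only for repeatability
    p_values = []
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)
        t = (xbar - null_mean) / (s / math.sqrt(n))
        p_values.append(upper_tail_t3(t))
    return p_values


def rejection_rate(p_values, alpha=0.05):
    """Proportion of simulated tests that reject H0 at level alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)


if __name__ == "__main__":
    for mu in (20.0, 21.0, 22.0):
        rate = rejection_rate(simulate_p_values(mu))
        print(f"mu = {mu}: proportion of P-values <= .05 is about {rate:.3f}")
```

Plotting a histogram of each list returned by `simulate_p_values` would give pictures analogous to the three panels of Figure 8.10.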


Whenever the observed value of a statistic such as $\bar{X}$ or $\hat{p}$ is reported, it is good
statistical practice to include a quantitative measure of the statistic’s precision, e.g.,
that the estimated standard error of $\bar{X}$ is $s/\sqrt{n}$. The P-value itself is a statistic: its
value can be calculated once sample data is available and a particular test procedure
is selected, and before such data is in hand, the P-value is subject to randomness.
So it would be nice to have available the standard deviation of the P-value or an
estimate of this standard deviation. Unfortunately the sampling distribution of a
P-value is in general quite complicated. The simulation results of Example 8.12 suggest
that the sampling distribution is quite skewed when $H_0$ is false (it is uniformly
distributed on (0, 1) when $H_0$ is true and the test statistic has a continuous distribution,
e.g., a $t$ distribution). A standard deviation is not as easy to interpret and use when
there is substantial non-normality. The statisticians Dennis Boos and Leonard Stefanski
investigated the random behavior of the P-value in their article “P-Value Precision and
Reproducibility” (The American Statistician, 2011: 213-221). To address non-normality,
they focused on the quantity $-\log(\text{P-value})$. The log-transformed P-value does for many
test procedures have approximately a normal distribution when $n$ is large.

Suppose application of a particular test procedure to sample data results in 
a P-value of .001. Then $H_0$ would be rejected using either a significance level of
.05 or .01. If a new sample from the same population distribution is then selected,
how likely is it that the P-value for this new data will lead to rejection of $H_0$ at a
significance level of .05 or .01? This is what the authors of the foregoing article 
meant by “reproducibility”: How likely is it that a new sample will lead to the same 
conclusion as that reached using the original sample? The answer to this question 
depends on the population distribution, the sample size, and the test procedure used. 
Nevertheless, based on their investigations, the authors suggested the following 
general guidelines: 


If the P-value for the original data is .0001, then $P(\text{new P-value} \le .05) \approx .97$,
whereas this probability is roughly .91 if the original P-value is .001 and it is
roughly .73 when the original P-value is .01.


Particularly when the original P-value is around .01, there is a reasonably good 
chance that a new sample will not lead to rejection of $H_0$ at the 5% significance
level. Thus unless the original P-value is really small, it would not be surprising to 
have a new sample contradict the inference drawn from the original data. A P-value 
not too much smaller than a chosen significance level such as .05 or .01 should be 
viewed with some caution! 


EXERCISES Section 8.3 (29-41) 


29. The true average diameter of ball bearings of a certain type is supposed to be .5 in.
A one-sample $t$ test will be carried out to see whether this is the case. What
conclusion is appropriate in each of the following situations?
a. $n = 13$, $t = 1.6$, $\alpha = .05$
b. $n = 13$, $t = -1.6$, $\alpha = .05$
c. $n = 25$, $t = -2.6$, $\alpha = .01$
d. $n = 25$, $t = -3.9$

30. A sample of $n$ sludge specimens is selected and the pH of each one is determined.
The one-sample $t$ test will then be used to see if there is compelling evidence for
concluding that true average pH is less than 7.0. What conclusion is appropriate in
each of the following situations?
a. $n = 6$, $t = -2.3$, $\alpha = .05$
b. $n = 15$, $t = -3.1$, $\alpha = .01$
c. & HR 1 6 1305
d. $n = 6$, $t = .7$, $\alpha = .05$
e. $n = 6$, $\bar{x} = 6.68$, $s/\sqrt{n} = .0820$

31. The paint used to make lines on roads must reflect enough light to be clearly visible
at night. Let $\mu$ denote the true average reflectometer reading for a new type of
paint under consideration. A test of $H_0\colon \mu = 20$ versus $H_a\colon \mu > 20$ will be based on a
random sample of size $n$ from a normal population distribution. What conclusion is
appropriate in each of the following situations?
a. $n = 15$, $t = 3.2$, $\alpha = .05$
b. $n = 9$, $t = 1.8$, $\alpha = .01$
c. $n = 24$, $t = -.2$

32. The relative conductivity of a semiconductor device is determined by the amount of
impurity “doped” into the device during its manufacture. A silicon diode to be used
for a specific purpose requires an average cut-on voltage of .60 V, and if this is not
achieved, the amount of impurity must be adjusted. A sample of diodes was selected
and the cut-on voltage was determined. The accompanying SAS output resulted from a
request to test the appropriate hypotheses.

N     Mean         Std Dev      T            Prob > |T|
15    0.0453333    0.0899100    1.9527887    0.0711

[Note: SAS explicitly tests $H_0\colon \mu = 0$, so to test $H_0\colon \mu = .60$, the null value .60 must
be subtracted from each $x_i$; the reported mean is then the average of the $(x_i - .60)$
values. Also, SAS’s P-value is always for a two-tailed test.] What would be concluded
for a significance level of .01? .05? .10?

33. The article “The Foreman’s View of Quality Control” (Quality Engr., 1990: 257-280)
described an investigation into the coating weights for large pipes resulting from
a galvanized coating process. Production standards call for a true average weight of
200 lb per pipe. The accompanying descriptive summary and boxplot are from Minitab.

Variable    N     Mean      Median    TrMean    StDev    SEMean
ctg wt      30    206.73    206.00    206.81    6.35     1.16

Variable    Min       Max       Q1        Q3
ctg wt      193.00    218.00    202.75    212.00

[Boxplot of coating weight; horizontal axis from 190 to 220]

a. What does the boxplot suggest about the status of the specification for true
average coating weight?
b. A normal probability plot of the data was quite straight. Use the descriptive
output to test the appropriate hypotheses.

34. The following observations are on stopping distance (ft) of a particular truck at
20 mph under specified experimental conditions (“Experimental Measurement of
the Stopping Performance of a Tractor-Semitrailer from Multiple Speeds,” NHTSA,
DOT HS 811 488, June 2011):

32.1  30.6  31.4  30.4  31.0  31.9

The cited report states that under these conditions, the maximum allowable stopping
distance is 30. A normal probability plot validates the assumption that stopping
distance is normally distributed.
a. Does the data suggest that true average stopping distance exceeds this maximum
value? Test the appropriate hypotheses using $\alpha = .01$.
b. Determine the probability of a type II error when $\alpha = .01$, $\sigma = .65$, and the
actual value of $\mu$ is 31. Repeat this for $\mu = 32$ (use either statistical software or
Table A.17).
c. Repeat (b) using $\sigma = .80$ and compare to the results of (b).
d. What sample size would be necessary to have $\alpha = .01$ and $\beta = .10$ when $\mu = 31$
and $\sigma = .65$?

35. The article “Uncertainty Estimation in Railway Track Life-Cycle Cost” (J. of Rail
and Rapid Transit, 2009) presented the following data on time to repair (min) a rail
break in the high rail on a curved track of a certain railway line.

159  120  480  149  270  547  340  43  228  202  240  218

A normal probability plot of the data shows a reasonably linear pattern, so it is
plausible that the population distribution of repair time is at least approximately
normal. The sample mean and standard deviation are 249.7 and 145.1, respectively.
a. Is there compelling evidence for concluding that true average repair time exceeds
200 min? Carry out a test of hypotheses using a significance level of .05.
b. Using $\sigma = 150$, what is the type II error probability of the test used in (a) when
true average repair time is actually 300 min? That is, what is $\beta(300)$?

36. Have you ever been frustrated because you could not get a container of some sort to
release the last bit of its contents? The article “Shake, Rattle, and Squeeze: How
Much Is Left in That Container?” (Consumer Reports, May 2009: 8) reported on an
investigation of this issue for various consumer products. Suppose five 6.0 oz tubes of
toothpaste of a particular brand are randomly selected and squeezed until no more
toothpaste will come out. Then each tube is cut open and the amount remaining is
weighed, resulting in the following data (consistent with what the cited article
reported): .53, .65, .46, .50, .37. Does it appear that the true average amount left is
less than 10% of the advertised net contents?
a. Check the validity of any assumptions necessary for testing the appropriate
hypotheses.
b. Carry out a test of the appropriate hypotheses using a significance level of .05.
Would your conclusion change if a significance level of .01 had been used?
c. Describe in context type I and II errors, and say which error might have been made
in reaching a conclusion.

37. The accompanying data on cube compressive strength (MPa) of concrete specimens
appeared in the article “Experimental Study of Recycled Rubber-Filled
High-Strength Concrete” (Magazine of Concrete Res., 2009: 549-556):

112.3   97.0   92.7   86.0  102.0
 99.2   95.8  103.5   89.0   86.7

a. Is it plausible that the compressive strength for this type of concrete is normally
distributed?
b. Suppose the concrete will be used for a particular application unless there is strong
evidence that true average strength is less than 100 MPa. Should the concrete be
used? Carry out a test of appropriate hypotheses.

38. A random sample of soil specimens was obtained, and the amount of organic matter (%)
in the soil was determined for each specimen, resulting in the accompanying data
(from “Engineering Properties of Soil,” Soil Science, 1998: 93-102).

1.10  5.09  0.97  1.59  4.60  0.32  0.55  1.45
0.14  4.47  1.20  3.50  5.02  4.67  5.22  2.69
3.98  3.17  3.03  2.21  0.69  4.47  3.31  1.17
0.76  1.17  1.57  2.62  1.66  2.05

The values of the sample mean, sample standard deviation, and (estimated) standard
error of the mean are 2.481, 1.616, and .295, respectively. Does this data suggest
that the true average percentage of organic matter in such soil is something other
than 3%? Carry out a test of the appropriate hypotheses at significance level .10.
Would your conclusion be different if $\alpha = .05$ had been used? [Note: A normal
probability plot of the data shows an acceptable pattern in light of the reasonably
large sample size.]

39. Reconsider the accompanying sample data on expense ratio (%) for large-cap growth
mutual funds first introduced in Exercise 1.53.

0.52  1.06  1.26  2.17  1.55  0.99  1.10  1.07  1.81  2.05
0.91  0.79  1.39  0.62  1.52  1.02  1.10  1.78  1.01  1.15

A normal probability plot shows a reasonably linear pattern.
a. Is there compelling evidence for concluding that the population mean expense ratio
exceeds 1%? Carry out a test of the relevant hypotheses using a significance level
of .01.
b. Referring back to (a), describe in context type I and II errors and say which error
you might have made in reaching your conclusion. The source from which the data
was obtained reported that $\mu = 1.33$ for the population of all 762 such funds. So did
you actually commit an error in reaching your conclusion?
c. Supposing that $\sigma = .5$, determine and interpret the power of the test in (a) for the
actual value of $\mu$ stated in (b).

40. Polymer composite materials have gained popularity because they have high strength
to weight ratios and are relatively easy and inexpensive to manufacture. However,
their nondegradable nature has prompted development of environmentally friendly
composites using natural materials. The article “Properties of Waste Silk Short
Fiber/Cellulose Green Composite Films” (J. of Composite Materials, 2012: 123-127)
reported that for a sample of 10 specimens with 2% fiber content, the sample mean
tensile strength (MPa) was 51.3 and the sample standard deviation was 1.2. Suppose
the true average strength for 0% fibers (pure cellulose) is known to be 48 MPa. Does
the data provide compelling evidence for concluding that true average strength for the
WSF/cellulose composite exceeds this value?

41. A spectrophotometer used for measuring CO concentration [ppm (parts per million) by
volume] is checked for accuracy by taking readings on a manufactured gas (called span
gas) in which the CO concentration is very precisely controlled at 70 ppm. If the
readings suggest that the spectrophotometer is not working properly, it will have to
be recalibrated. Assume that if it is properly calibrated, measured concentration for
span gas samples is normally distributed. On the basis of the six readings (85, 77, 82,
68, 72, and 69), is recalibration necessary? Carry out a test of the relevant hypotheses
using $\alpha = .05$.


8.4 Tests Concerning a Population Proportion 


Let p denote the proportion of individuals or objects in a population who pos- 
sess a specified property (e.g., college students who graduate without any debt, or 
computers that do not need service during the warranty period). If an individual or 
object with the property is labeled a success (S), then p is the population proportion 
of successes. Tests concerning p will be based on a random sample of size n from 
the population. Provided that $n$ is small relative to the population size, $X$ (the
number of S’s in the sample) has (approximately) a binomial distribution. Furthermore,
if $n$ itself is large [$np \ge 10$ and $n(1 - p) \ge 10$], both $X$ and the estimator $\hat{p} = X/n$
are approximately normally distributed. We first consider large-sample tests based
on this latter fact and then turn to the small-sample case that directly uses the
binomial distribution.


Large-Sample Tests 


Large-sample tests concerning $p$ are a special case of the more general
large-sample procedures for a parameter $\theta$. Let $\hat{\theta}$ be an estimator of $\theta$ that is
(at least approximately) unbiased and has approximately a normal distribution.
The null hypothesis has the form $H_0\colon \theta = \theta_0$, where $\theta_0$ denotes a number
(the null value) appropriate to the problem context. Suppose that when $H_0$ is
true, the standard deviation of $\hat{\theta}$, $\sigma_{\hat{\theta}}$, involves no unknown parameters. For
example, if $\theta = \mu$ and $\hat{\theta} = \bar{X}$, then $\sigma_{\hat{\theta}} = \sigma_{\bar{X}} = \sigma/\sqrt{n}$, which involves no unknown
parameters only if the value of $\sigma$ is known. A large-sample test statistic results
from standardizing $\hat{\theta}$ under the assumption that $H_0$ is true (so that $E(\hat{\theta}) = \theta_0$):


\[
\text{Test statistic: } Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}
\]


If the alternative hypothesis is $H_a\colon \theta > \theta_0$, an upper-tailed test whose significance
level is approximately $\alpha$ has P-value $= 1 - \Phi(z)$. The other two alternatives,
$H_a\colon \theta < \theta_0$ and $H_a\colon \theta \ne \theta_0$, are tested using a lower-tailed $z$ test and a two-tailed $z$
test, respectively.

In the case $\theta = p$, $\sigma_{\hat{\theta}}$ will not involve any unknown parameters when $H_0$ is true,
but this is atypical. When $\sigma_{\hat{\theta}}$ does involve unknown parameters, it is often possible to
use an estimated standard deviation $S_{\hat{\theta}}$ in place of $\sigma_{\hat{\theta}}$ and still have $Z$ approximately
normally distributed when $H_0$ is true (because this substitution does not increase
variability in $Z$ by very much). The large-sample test of the previous section furnishes an
example of this: Because $\sigma$ is usually unknown, we use $s_{\hat{\theta}} = s_{\bar{X}} = s/\sqrt{n}$ in place of
$\sigma/\sqrt{n}$ in the denominator of $z$.

The estimator $\hat{p} = X/n$ is unbiased ($E(\hat{p}) = p$) and its standard deviation is
$\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$. These facts along with approximate normality were used in
Section 7.2 to obtain a confidence interval for $p$. When $H_0$ is true, $E(\hat{p}) = p_0$ and
$\sigma_{\hat{p}} = \sqrt{p_0(1 - p_0)/n}$, so $\sigma_{\hat{p}}$ does not involve any unknown parameters. It then
follows that when $n$ is large and $H_0$ is true, the test statistic


\[
Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}
\]


has approximately a standard normal distribution. The P-value for the test is
then a $z$ curve area, just as it was in the case of large-sample $z$ tests concerning
$\mu$. Its calculation depends on which of the three inequalities in $H_a$ is under
consideration.


Null hypothesis: $H_0\colon p = p_0$

Test statistic value: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$

Alternative Hypothesis      P-Value Determination

$H_a\colon p > p_0$           Area under the standard normal curve to the right of $z$

$H_a\colon p < p_0$           Area under the standard normal curve to the left of $z$

$H_a\colon p \ne p_0$         $2 \cdot$ (Area under the standard normal curve to the right of $|z|$)

These test procedures are valid provided that $np_0 \ge 10$ and $n(1 - p_0) \ge 10$.


They are referred to as upper-tailed, lower-tailed, and two-tailed, respectively,
for the three different alternative hypotheses.


EXAMPLE 8.13 Student use of cell phones during class is perceived by many faculty to be an
annoying but perhaps harmless distraction. However, the use of a phone to text
during an exam is a serious breach of conduct. The article “The Use and Abuse of
Cell Phones and Text Messaging During Class: A Survey of College Students”
(College Teaching, 2012: 1-9) reported that 27 of the 267 students in a sample
admitted to doing this. Can it be concluded at significance level .001 that more than
5% of all students in the population sampled had texted during an exam?


1. The parameter of interest is the proportion $p$ of the sampled population that has
texted during an exam.

2. The null hypothesis is $H_0\colon p = .05$.

3. The alternative hypothesis is $H_a\colon p > .05$.

4. Since $np_0 = 267(.05) = 13.35 \ge 10$ and $n(1 - p_0) = 267(.95) = 253.65 \ge 10$,
the large-sample $z$ test can be used. The test statistic value is
$z = (\hat{p} - .05)/\sqrt{(.05)(.95)/n}$.

5. $\hat{p} = 27/267 = .1011$, from which $z = (.1011 - .05)/\sqrt{(.05)(.95)/267} =
.0511/.0133 = 3.84$.

6. The P-value for this upper-tailed $z$ test is $1 - \Phi(3.84) < 1 - \Phi(3.49) = .0003$
(software gives .000062).

7. The null hypothesis is resoundingly rejected because P-value $< .0003 \le .001 = \alpha$.
The evidence for concluding that the population percentage of students who text
during an exam exceeds 5% is very compelling. The cited article’s abstract
contained the following comment: “The majority of the students surveyed believe that
instructors are largely unaware of the extent to which texting and other cell phone
activities engage students in the classroom.” Maybe it is time for instructors,
administrators, and student leaders to become proactive about this issue. ■
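
The large-sample procedure for a proportion can be sketched in a few lines of Python using only the standard library. The function name `one_proportion_z_test` and its `alternative` argument are illustrative assumptions, not part of the text. Applied to the data of Example 8.13 it should give a $z$ value near 3.8 and a P-value well below .001; small differences from the hand calculation arise because the example rounds intermediate values.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(x, n, p0, alternative="greater"):
    """Large-sample z test of H0: p = p0 based on x successes in n trials.

    alternative is "greater" (Ha: p > p0), "less" (Ha: p < p0),
    or "two-sided" (Ha: p != p0). Returns (z, p_value).
    """
    # Validity conditions stated in the text: n*p0 >= 10 and n*(1 - p0) >= 10.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("conditions n*p0 >= 10 and n*(1 - p0) >= 10 are not met")
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    phi = NormalDist().cdf  # standard normal cdf
    if alternative == "greater":
        p_value = 1 - phi(z)
    elif alternative == "less":
        p_value = phi(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - phi(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


if __name__ == "__main__":
    # Data of Example 8.13: 27 of 267 students, H0: p = .05 versus Ha: p > .05.
    z, p_value = one_proportion_z_test(27, 267, 0.05, alternative="greater")
    print(f"z = {z:.2f}, P-value = {p_value:.6f}")
```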


$\beta$ and Sample Size Determination When $H_0$ is true, the test statistic $Z$ has
approximately a standard normal distribution. Now suppose that $H_0$ is not true and that $p = p'$.
Then $Z$ still has approximately a normal distribution (because it is a linear function of
$\hat{p}$), but its mean value and variance are no longer 0 and 1, respectively. Instead,


\[
E(Z) = \frac{p' - p_0}{\sqrt{p_0(1 - p_0)/n}} \qquad
V(Z) = \frac{p'(1 - p')/n}{p_0(1 - p_0)/n}
\]


The null hypothesis will not be rejected if P-value $> \alpha$. For an upper-tailed $z$
test (inequality $>$ in $H_a$), we argued previously that this is equivalent to $z < z_\alpha$.
The probability of a type II error (not rejecting $H_0$ when it is false) is $\beta(p') =
P(Z < z_\alpha \text{ when } p = p')$. This can be computed by using the given mean and
variance to standardize and then referring to the standard normal cdf. In addition, if it
is desired that the level $\alpha$ test also have $\beta(p') = \beta$ for a specified value of $\beta$, this
equation can be solved for the necessary $n$ as in Section 8.2. General expressions for
$\beta(p')$ and $n$ are given in the accompanying box.
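
For the upper-tailed case, the standardization just described can be written out step by step. The following is a worked sketch that uses only the mean and variance of $Z$ stated above; the other two alternatives follow in the same way.

```latex
\begin{aligned}
\beta(p') &= P(Z < z_\alpha \text{ when } p = p') \\
          &= \Phi\!\left(\frac{z_\alpha - E(Z)}{\sqrt{V(Z)}}\right) \\
          &= \Phi\!\left(\frac{z_\alpha - \dfrac{p' - p_0}{\sqrt{p_0(1 - p_0)/n}}}
                              {\sqrt{p'(1 - p')/n}\,\big/\sqrt{p_0(1 - p_0)/n}}\right) \\
          &= \Phi\!\left(\frac{p_0 - p' + z_\alpha\sqrt{p_0(1 - p_0)/n}}
                              {\sqrt{p'(1 - p')/n}}\right)
\end{aligned}
```

The last line comes from multiplying the numerator and denominator of the previous line by $\sqrt{p_0(1 - p_0)/n}$.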


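Alternative Hypothesis      $\beta(p')$

$H_a\colon p > p_0$           $\Phi\left[\dfrac{p_0 - p' + z_\alpha\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right]$

$H_a\colon p < p_0$           $1 - \Phi\left[\dfrac{p_0 - p' - z_\alpha\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right]$

$H_a\colon p \ne p_0$         $\Phi\left[\dfrac{p_0 - p' + z_{\alpha/2}\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right] - \Phi\left[\dfrac{p_0 - p' - z_{\alpha/2}\sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}}\right]$


The sample size $n$ for which the level $\alpha$ test also satisfies $\beta(p') = \beta$ is


\[
n = \begin{cases}
\left[\dfrac{z_\alpha\sqrt{p_0(1 - p_0)} + z_\beta\sqrt{p'(1 - p')}}{p' - p_0}\right]^2 & \text{one-tailed test} \\[3ex]
\left[\dfrac{z_{\alpha/2}\sqrt{p_0(1 - p_0)} + z_\beta\sqrt{p'(1 - p')}}{p' - p_0}\right]^2 & \text{two-tailed test (an approximate solution)}
\end{cases}
\]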


EXAMPLE 8.14 A package-delivery service advertises that at least 90% of all packages brought to
its office by 9 a.m. for delivery in the same city are delivered by noon that day. Let
$p$ denote the true proportion of such packages that are delivered as advertised and
consider the hypotheses $H_0\colon p = .9$ versus $H_a\colon p < .9$. If only 80% of the packages
are delivered as advertised, how likely is it that a level .01 test based on $n = 225$
packages will detect such a departure from $H_0$? What should the sample size be to
ensure that $\beta(.8) = .01$? With $\alpha = .01$, $p_0 = .9$, $p' = .8$, and $n = 225$,


\[
\beta(.8) = 1 - \Phi\left(\frac{.9 - .8 - 2.33\sqrt{(.9)(.1)/225}}{\sqrt{(.8)(.2)/225}}\right)
= 1 - \Phi(2.00) = .0228
\]


Thus the probability that $H_0$ will be rejected using the test when $p = .8$ is .9772;
roughly 98% of all samples will result in correct rejection of $H_0$.
Using $z_\alpha = z_\beta = 2.33$ in the sample size formula yields


\[
n = \left[\frac{2.33\sqrt{(.9)(.1)} + 2.33\sqrt{(.8)(.2)}}{.8 - .9}\right]^2 \approx 266 \qquad ■
\]

Small-Sample Tests 


Test procedures when the sample size $n$ is small are based directly on the binomial
distribution rather than the normal approximation. Consider the alternative
hypothesis $H_a\colon p > p_0$ and again let $X$ be the number of successes in the sample. Then $X$ is
the test statistic. When $H_0$ is true, $X$ has a binomial distribution with parameters $n$
and $p_0$, so


\[
\begin{aligned}
\text{P-value} &= P(X \ge x \text{ when } H_0 \text{ is true}) \\
               &= P(X \ge x \text{ when } X \sim \text{Bin}(n, p_0)) \\
               &= 1 - P(X \le x - 1 \text{ when } X \sim \text{Bin}(n, p_0)) \\
               &= 1 - B(x - 1; n, p_0)
\end{aligned}
\]
Because $X$ has a discrete probability distribution, it is usually not possible to obtain
a test for which P(type I error) is exactly the desired significance level $\alpha$ (e.g., .05
or .01; refer back to middle of page 323 for an example).

Let $p'$ denote an alternative value of $p$ ($p' > p_0$). When $p = p'$, $X \sim \text{Bin}(n, p')$.
The probability of a type II error is then calculated by expressing the condition
P-value $> \alpha$ in the equivalent form $x < c_\alpha$. Then


\[
\begin{aligned}
\beta(p') &= P(\text{type II error when } p = p') \\
          &= P(X < c_\alpha \text{ when } X \sim \text{Bin}(n, p')) = B(c_\alpha - 1; n, p')
\end{aligned}
\]


That is, $\beta(p')$ is the result of a straightforward binomial probability calculation.
The sample size $n$ necessary to ensure that a level $\alpha$ test also has specified $\beta$ at
a particular alternative value $p'$ must be determined by trial and error using the
binomial cdf.

Test procedures for $H_a\colon p < p_0$ and for $H_a\colon p \ne p_0$ are constructed in a similar
manner. In the former case, the P-value is $B(x; n, p_0)$. The P-value when the
alternative hypothesis is $H_a\colon p \ne p_0$ is twice the smaller of the two probabilities $B(x; n, p_0)$
and $1 - B(x - 1; n, p_0)$.


EXAMPLE 8.15 A plastics manufacturer has developed a new type of plastic trash can and proposes to 
sell them with an unconditional 6-year warranty. To see whether this is economically 
feasible, 20 prototype cans are subjected to an accelerated life test to simulate 6 years 
of use. The proposed warranty will be modified only if the sample data strongly sug- 
gests that fewer than 90% of such cans would survive the 6-year period. Let p denote 
the proportion of all cans that survive the accelerated test. The relevant hypotheses 
are $H_0\colon p = .9$ versus $H_a\colon p < .9$. A decision will be based on the test statistic $X$, the
number among the 20 that survive. Because of the inequality in $H_a$, any value smaller
than the observed value $x$ is more contradictory to $H_0$ than is $x$ itself. Therefore


\[
\text{P-value} = P(X \le x \text{ when } H_0 \text{ is true}) = B(x; 20, .9)
\]


From Appendix Table A.1, $B(15; 20, .9) = .043$, whereas $B(16; 20, .9) = .133$. The
closest achievable significance level to .05 is therefore .043. Since $B(14; 20, .9) =
.011$, $H_0$ would be rejected at this significance level if the accelerated test results
in $x = 14$. It would then be appropriate to modify the proposed warranty. Because
P-value $\le .043$ is equivalent to $x \le 15$, $H_0$ is not rejected when $x \ge 16$, so the
probability of a type II error for the alternative value $p' = .8$ is


\[
\begin{aligned}
\beta(.8) &= P(H_0 \text{ is not rejected when } X \sim \text{Bin}(20, .8)) \\
          &= P(X \ge 16 \text{ when } X \sim \text{Bin}(20, .8)) \\
          &= 1 - B(15; 20, .8) = 1 - .370 = .630
\end{aligned}
\]


That is, when $p = .8$, 63% of all samples consisting of $n = 20$ cans would result in
$H_0$ being incorrectly not rejected. This error probability is high because 20 is a small
sample size and $p' = .8$ is close to the null value $p_0 = .9$. ■
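
The binomial calculations in Example 8.15 can be reproduced without tables. The following is a minimal sketch using only the Python standard library; the function names (`binom_cdf`, `lower_tailed_p_value`, `upper_tailed_p_value`, `beta_lower_tailed`) are illustrative assumptions. Here `binom_cdf(x, n, p)` plays the role of $B(x; n, p)$, and `beta_lower_tailed` finds the largest $x$ whose P-value is at most $\alpha$ before computing the chance of not rejecting $H_0$.

```python
from math import comb


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) when X ~ Bin(n, p)."""
    if x < 0:
        return 0.0
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(min(x, n) + 1))


def lower_tailed_p_value(x, n, p0):
    """P-value for Ha: p < p0, namely B(x; n, p0)."""
    return binom_cdf(x, n, p0)


def upper_tailed_p_value(x, n, p0):
    """P-value for Ha: p > p0, namely 1 - B(x - 1; n, p0)."""
    return 1 - binom_cdf(x - 1, n, p0)


def beta_lower_tailed(p_alt, n, p0, alpha):
    """P(H0 not rejected when p = p_alt) for the lower-tailed test at level alpha."""
    # c is the largest x with P-value B(x; n, p0) <= alpha, so H0 is rejected when X <= c.
    c = -1
    while c + 1 <= n and binom_cdf(c + 1, n, p0) <= alpha:
        c += 1
    return 1 - binom_cdf(c, n, p_alt)


if __name__ == "__main__":
    # Example 8.15: n = 20, H0: p = .9 versus Ha: p < .9.
    for x in (14, 15, 16):
        print(x, round(lower_tailed_p_value(x, 20, 0.9), 3))
    print(round(beta_lower_tailed(0.8, 20, 0.9, alpha=0.05), 3))
```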


EXERCISES Section 8.4 (42-52) 


42. Consider using a z test to test Hy: p = .6. Determine the 43. A common characterization of obese individuals is that 
P-value in each of the following situations. their body mass index is at least 30 [BMI = weight/(height)?, 
a. H,: p > .6,z = 1.47 where height is in meters and weight is in kilograms]. The 
b. H,: p < .6,z = —2.70 article ‘The Impact of Obesity on IlIness Absence and 
c. H,: p # .6,z = —2.70 Productivity in an Industrial Population of Petrochemical 
d. Hyp <.6,2 = .25 Workers” (Annals of Epidemiology, 2008: 8-14) reported 
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44, 


45. 


46. 


47. 


that in a sample of female workers, 262 had BMIs of less 

than 25, 159 had BMIs that were at least 25 but less than 30, 

and 120 had BMIs exceeding 30. Is there compelling evi- 

dence for concluding that more than 20% of the individuals 

in the sampled population are obese? 

a. State and test appropriate hypotheses with a signifi- 
cance level of .05. 

b. Explain in the context of this scenario what consti- 
tutes type I and II errors. 

c. What is the probability of not concluding that more 
than 20% of the population is obese when the actual 
percentage of obese individuals is 25%? 


A manufacturer of nickel-hydrogen batteries randomly 
selects 100 nickel plates for test cells, cycles them a 
specified number of times, and determines that 14 of the 
plates have blistered. 

a. Does this provide compelling evidence for conclud- 
ing that more than 10% of all plates blister under 
such circumstances? State and test the appropriate 
hypotheses using a significance level of .05. In 
reaching your conclusion, what type of error might 
you have committed? 

b. If it is really the case that 15% of all plates blister 
under these circumstances and a sample size of 100 
is used, how likely is it that the null hypothesis of 
part (a) will not be rejected by the level .05 test? 
Answer this question for a sample size of 200. 

c. How many plates would have to be tested to have 
B(.15) = .10 for the test of part (a)? 


A random sample of 150 recent donations at a certain 
blood bank reveals that 82 were type A blood. Does this 
suggest that the actual percentage of type A donations 
differs from 40%, the percentage of the population hav- 
ing type A blood? Carry out a test of the appropriate 
hypotheses using a significance level of .01. Would your 
conclusion have been different if a significance level of 
.05 had been used? 


It is known that roughly 2/3 of all human beings have a 

dominant right foot or eye. Is there also right-sided domi- 

nance in kissing behavior? The article “Human Behavior: 

Adult Persistence of Head-Turning Asymmetry” 

(Nature, 2003: 771) reported that in a random sample of 

124 kissing couples, both people in 80 of the couples 

tended to lean more to the right than to the left. 

a. If 2/3 of all kissing couples exhibit this right-leaning 
behavior, what is the probability that the number in a 
sample of 124 who do so differs from the expected 
value by at least as much as what was actually 
observed? 

b. Does the result of the experiment suggest that the 2/3 
figure is implausible for kissing behavior? State and 
test the appropriate hypotheses. 


The article “Effects of Bottle Closure Type on 
Consumer Perception of Wine Quality” (Amer. J. of 
Enology and Viticulture, 2007: 182-191) reported that 
in a sample of 106 wine consumers, 22 (20.8%) thought 
that screw tops were an acceptable substitute for natural 
corks. Suppose a particular winery decided to use screw 
tops for one of its wines unless there was strong evidence 
to suggest that fewer than 25% of wine consumers found 
this acceptable. 

a. Using a significance level of .10, what would you 
recommend to the winery? 

b. For the hypotheses tested in (a), describe in context 
what the type I and II errors would be, and say which 
type of error might have been committed. 


48. With domestic sources of building supplies running low 
several years ago, roughly 60,000 homes were built with 
imported Chinese drywall. According to the article 
“Report Links Chinese Drywall to Home Problems” 
(New York Times, Nov. 24, 2009), federal investigators 
identified a strong association between chemicals in the 
drywall and electrical problems, and there is also strong 
evidence of respiratory difficulties due to the emission of 
hydrogen sulfide gas. An extensive examination of 51 
homes found that 41 had such problems. Suppose these 
51 were randomly sampled from the population of all 
homes having Chinese drywall. 

a. Does the data provide strong evidence for concluding 
that more than 50% of all homes with Chinese 
drywall have electrical/environmental problems? 
Carry out a test of hypotheses using $\alpha = .01$. 

b. Calculate a lower confidence bound using a confidence 
level of 99% for the percentage of all such 
homes that have electrical/environmental problems. 

c. If it is actually the case that 80% of all such homes 
have problems, how likely is it that the test of (a) 
would not conclude that more than 50% do? 


49. A plan for an executive travelers’ club has been developed 
by an airline on the premise that 5% of its current 
customers would qualify for membership. A random 
sample of 500 customers yielded 40 who would qualify. 

a. Using this data, test at level .01 the null hypothesis 
that the company’s premise is correct against the 
alternative that it is not correct. 

b. What is the probability that when the test of part (a) 
is used, the company’s premise will be judged correct 
when in fact 10% of all current customers qualify? 


50. Each of a group of 20 intermediate tennis players is 
given two rackets, one having nylon strings and the other 
synthetic gut strings. After several weeks of playing with 
the two rackets, each player will be asked to state a 
preference for one of the two types of strings. Let $p$ denote 
the proportion of all such players who would prefer gut 
to nylon, and let $X$ be the number of players in the 
sample who prefer gut. Because gut strings are more 
expensive, consider the null hypothesis that at most 50% of all 
such players prefer gut. We simplify this to $H_0$: $p = .5$, 
planning to reject $H_0$ only if sample evidence strongly 
favors gut strings. 

a. Is a significance level of exactly .05 achievable? If 
not, what is the largest $\alpha$ smaller than .05 that is 
achievable? 

b. If 60% of all enthusiasts prefer gut, calculate the 
probability of a type II error using the significance 
level from part (a). Repeat if 80% of all enthusiasts 
prefer gut. 

c. If 13 out of the 20 players prefer gut, should $H_0$ be 
rejected using the significance level of (a)? 


51. A manufacturer of plumbing fixtures has developed a 
new type of washerless faucet. Let $p = P$(a randomly 
selected faucet of this type will develop a leak within 2 
years under normal use). The manufacturer has decided to 
proceed with production unless it can be determined that 
$p$ is too large; the borderline acceptable value of $p$ is 
specified as .10. The manufacturer decides to subject $n$ of 
these faucets to accelerated testing (approximating 
2 years of normal use). With $X$ = the number among the 
$n$ faucets that leak before the test concludes, production 
will commence unless the observed $X$ is too large. It is 
decided that if $p = .10$, the probability of not proceeding 
should be at most .10, whereas if $p = .30$ the probability 
of proceeding should be at most .10. Can $n = 10$ be used? 
$n = 20$? $n = 25$? What are the actual error probabilities 
for the chosen $n$? 


52. In a sample of 171 students at an Australian university 
that introduced the use of plagiarism-detection software 
in a number of courses, 58 students indicated a belief that 
such software unfairly targets students (“Student and 
Staff Perceptions of the Effectiveness of Plagiarism 
Detection Software,” Australian J. of Educ. Tech., 
2008: 222-240). Does this suggest that a majority of 
students at the university do not share this belief? Test 
appropriate hypotheses. 


8.5 Further Aspects of Hypothesis Testing 


We close this introductory chapter on hypothesis testing by briefly considering 
a variety of issues involving the use of test procedures: the distinction between 
statistical and practical significance, the relationship between tests and confidence 
intervals, the implications of multiple testing, and a general method for deriving test 
statistics. 


Statistical Versus Practical Significance 


Statistical significance means simply that the null hypothesis was rejected at the 
selected significance level. That is, in the judgment of the investigator, any observed 
discrepancy between the data and what would be expected were $H_0$ true cannot 
be explained solely by chance variation. However, a small P-value, which would 
ordinarily indicate statistical significance, may be the result of a large sample size 
in combination with a departure from $H_0$ that has little practical significance. In 
many experimental situations, only departures from $H_0$ of large magnitude would be 
worthy of detection, whereas a small departure from $H_0$ would have little practical 
significance. 

As an example, let $\mu$ denote the true average IQ of all children in the very 
large city of Euphoria. Consider testing $H_0$: $\mu = 100$ versus $H_a$: $\mu > 100$ 
assuming a normal IQ distribution with $\sigma = 15$ (100 is conventionally believed to be the 
average IQ for all individuals, so parents of Euphorian children might be euphoric to 
have the null hypothesis rejected). But one IQ point is no big deal, so the value $\mu = 101$ 
certainly does not represent a departure from $H_0$ that has practical significance. 
For a reasonably large sample size $n$, this $\mu$ would lead to an $\bar{x}$ value near 101, so 
we would not want this sample evidence to argue strongly for rejection of $H_0$ when 
$\bar{x} = 101$ is observed. For various sample sizes, Table 8.1 records both the P-value 
when $\bar{x} = 101$ and also the probability of not rejecting $H_0$ at level .01 when $\mu = 101$. 

Table 8.1 An Illustration of the Effect of Sample Size on P-values and $\beta$ 

     n     P-Value When $\bar{x} = 101$     $\beta(101)$ for Level .01 Test 
    25     .3707                            .9772 
   100     .2514                            .9525 
   400     .0918                            .8413 
   900     .0228                            .6293 
  1600     .0038                            .3707 
  2500     .0004                            .1587 
  5000     .0000012                         .0087 
10,000     .0000000                         .0000075 


The second column in Table 8.1 shows that even for moderately large sample 
sizes, the P-value resulting from $\bar{x} = 101$ argues very strongly for rejection of $H_0$, 
whereas the observed $\bar{x}$ itself suggests that in practical terms the true value of $\mu$ differs 
little from the null value $\mu_0 = 100$. The third column points out that even when there 
is little practical difference between the true $\mu$ and the null value, for a fixed level of 
significance a large sample size will frequently lead to rejection of the null hypothesis 
at that level. To summarize, one must be especially careful in interpreting evidence 
when the sample size is large, since any small departure from $H_0$ will almost surely 
be detected by a test, yet such a departure may have little practical significance. 
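
The entries of Table 8.1 can be recomputed with a few lines of code. The following is a minimal sketch, assuming the upper-tailed z test with known $\sigma = 15$ described above; the function names are our own and are not taken from any statistical package. Because the code uses unrounded z values and the exact critical value rather than 2.33, its output agrees with the table only to about two decimal places.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def p_value_upper(xbar, mu0, sigma, n):
    """P-value of the upper-tailed z test of H0: mu = mu0 (sigma known)."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 1 - Z.cdf(z)


def beta_upper(mu_alt, mu0, sigma, n, alpha):
    """Type II error probability of the level-alpha upper-tailed z test
    when the true mean is mu_alt."""
    z_alpha = Z.inv_cdf(1 - alpha)  # upper-tail critical value
    return Z.cdf(z_alpha - (mu_alt - mu0) / (sigma / sqrt(n)))


# Same setting as Table 8.1: xbar = 101, mu0 = 100, sigma = 15, level .01
for n in (25, 100, 400, 900, 1600, 2500, 5000, 10000):
    p = p_value_upper(101, 100, 15, n)
    b = beta_upper(101, 100, 15, n, 0.01)
    print(f"{n:>6}  P-value = {p:.7f}  beta(101) = {b:.7f}")
```

Running the loop shows both columns shrinking toward 0 as $n$ grows, which is the pattern the table is meant to illustrate.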


The Relationship between Confidence 
Intervals and Hypothesis Tests 


Suppose the standardized variable $Z = (\hat{\theta} - \theta)/\hat{\sigma}_{\hat{\theta}}$ has (at least approximately) a 
standard normal distribution. The central z curve area captured between $-1.96$ and 
1.96 is .95 (and the remaining area .05 is split equally between the two tails, giving 
area .025 in each one). This implies that a confidence interval for $\theta$ with confidence 
level 95% is $\hat{\theta} \pm 1.96\hat{\sigma}_{\hat{\theta}}$. 

Now consider testing $H_0$: $\theta = \theta_0$ versus $H_a$: $\theta \neq \theta_0$ at significance level .05 
using the test statistic $Z = (\hat{\theta} - \theta_0)/\hat{\sigma}_{\hat{\theta}}$. The phrase “z test” implies that when the 
null hypothesis is true, Z has (at least approximately) a standard normal distribution. 
So the P-value will be twice the area under the z curve to the right of $|z|$. This 
P-value will be less than or equal to .05, allowing for rejection of the null hypothesis, 
if and only if either $z \geq 1.96$ or $z \leq -1.96$. The null hypothesis will therefore not 
be rejected if $-1.96 < z < 1.96$. 

Substituting the formula for z into this latter system of inequalities and manipulating 
them to isolate $\theta_0$ gives the equivalent system 
$\hat{\theta} - 1.96\hat{\sigma}_{\hat{\theta}} < \theta_0 < \hat{\theta} + 1.96\hat{\sigma}_{\hat{\theta}}$. 
The lower limit in this system is just the left endpoint of the 95% confidence interval, 
and the upper limit is the right endpoint of the interval. What this says is that the null 
hypothesis will not be rejected if and only if the null value $\theta_0$ lies in the confidence 
interval. Suppose, for example, that sample data yields the 95% CI (68.6, 72.0). 
Then the null hypothesis $H_0$: $\theta = 70$ cannot be rejected at significance level .05 
because 70 lies in the CI. But the null hypothesis $H_0$: $\theta = 65$ can be rejected because 
65 does not lie in the CI. There is an analogous relationship between a 99% CI and a 
test with significance level .01: the null hypothesis cannot be rejected if the null 
value lies in the CI and should be rejected if the null value is outside the CI. There is 
a duality between a two-sided confidence interval with confidence level $100(1 - \alpha)\%$ 
and the conclusion from a two-tailed test with significance level $\alpha$. 

Now consider testing $H_0$: $\theta = \theta_0$ against the alternative $H_a$: $\theta > \theta_0$ at 
significance level .01. Because of the inequality in $H_a$, the P-value is the area under the 
z curve to the right of the calculated z. The z critical value 2.33 captures upper-tail area 
.01. Therefore the P-value (captured upper-tail area) will be at most .01 if and only if 
$z \geq 2.33$; we will not be able to reject the null hypothesis if and only if $z < 2.33$. Again 
substituting the formula for z into this inequality and manipulating to isolate $\theta_0$ gives 
the equivalent inequality $\hat{\theta} - 2.33\hat{\sigma}_{\hat{\theta}} < \theta_0$. The lower limit of this inequality is the 
lower confidence bound for $\theta$ with a confidence level of 99%. So the null hypothesis 
won’t be rejected at significance level .01 if and only if the null value exceeds the lower 
confidence bound. Thus there is a duality between a lower confidence bound and the 
conclusion from an upper-tailed test. This is why the Minitab software package will 
output a lower confidence bound when an upper-tailed test is performed. If, for example, 
the 90% lower confidence bound is 25.3, i.e., $25.3 < \theta$ with confidence level 90%, 
then we would not be able to reject $H_0$: $\theta = 26$ versus $H_a$: $\theta > 26$ at significance level 
.10 but would be able to reject $H_0$: $\theta = 24$ in favor of $H_a$: $\theta > 24$. There is an analogous 
duality between an upper confidence bound and the conclusion from a lower-tailed test. 
And there are analogous relationships for t tests and t confidence intervals or bounds. 


PROPOSITION Let $(\hat{\theta}_L, \hat{\theta}_U)$ be a confidence interval for $\theta$ with confidence level $100(1 - \alpha)\%$. 
Then a test of $H_0$: $\theta = \theta_0$ versus $H_a$: $\theta \neq \theta_0$ with significance level $\alpha$ rejects 
the null hypothesis if the null value $\theta_0$ is not included in the CI and does not 
reject $H_0$ if the null value does lie in the CI. There is an analogous relationship 
between a lower confidence bound and an upper-tailed test, and also between 
an upper confidence bound and a lower-tailed test. 
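
The duality stated in the proposition can be checked numerically. The sketch below assumes a z-based interval and test; the estimate 70.3 and standard error .8673 are hypothetical values chosen only so that the 95% CI comes out close to the interval (68.6, 72.0) used in the example above, and the function names are ours.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def two_sided_ci(theta_hat, se, alpha):
    """100(1 - alpha)% z confidence interval for theta."""
    z = Z.inv_cdf(1 - alpha / 2)
    return theta_hat - z * se, theta_hat + z * se


def two_tailed_p(theta_hat, se, theta0):
    """P-value for H0: theta = theta0 versus Ha: theta != theta0."""
    z = (theta_hat - theta0) / se
    return 2 * (1 - Z.cdf(abs(z)))


def lower_bound(theta_hat, se, alpha):
    """100(1 - alpha)% lower confidence bound for theta."""
    return theta_hat - Z.inv_cdf(1 - alpha) * se


def upper_tailed_p(theta_hat, se, theta0):
    """P-value for H0: theta = theta0 versus Ha: theta > theta0."""
    return 1 - Z.cdf((theta_hat - theta0) / se)


# Hypothetical estimate and standard error (assumed for illustration only)
theta_hat, se, alpha = 70.3, 0.8673, 0.05
lo, hi = two_sided_ci(theta_hat, se, alpha)
for theta0 in (70, 65):
    in_ci = lo < theta0 < hi
    reject = two_tailed_p(theta_hat, se, theta0) <= alpha
    # H0 is rejected exactly when theta0 falls outside the interval
    print(theta0, "in CI:", in_ci, "reject H0:", reject)
```

The same comparison can be made between `lower_bound` and `upper_tailed_p`: the upper-tailed test at level $\alpha$ fails to reject exactly when the null value exceeds the lower confidence bound.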


In light of these relationships, it is tempting to carry out a test of hypotheses by cal- 
culating the corresponding CI or CB. Don’t yield to temptation! Instead carry out a 
more informative analysis by determining and reporting the P-value. 


Simultaneous Testing of Several 
Hypotheses 


Many published articles report the results of more than just a single test of hypoth- 
eses. For example, the article “Distributions of Compressive Strength Obtained 
from Various Diameter Cores” (ACI Materials J., 2012: 597-606) considered the 
plausibility of Weibull, normal, and lognormal distributions as models for compres- 
sive strength distributions under various experimental conditions. Table 3 of the 
cited article reported exact P-values for a total of 71 different tests. 

Consider two different tests, one for a pair of hypotheses about a population 
mean and another for a pair of hypotheses about a population proportion—e.g., the 
mean wing length for adult Monarch butterflies and the proportion of schoolchildren 
in a particular state who are obese. Assume that the sample used to test the first pair 
of hypotheses is selected independently of that used to test the second pair. Then if 
each test is carried out at significance level .05 (type I error probability .05), 


$$
\begin{aligned}
P(\text{at least one type I error is committed}) &= 1 - P(\text{no type I errors are committed}) \\
&= 1 - P(\text{no type I error in the 1st test}) \cdot P(\text{no type I error in the 2nd test}) \\
&= 1 - (.95)^2 = 1 - .9025 = .0975
\end{aligned}
$$


Thus the probability of committing at least one type I error when two independent tests 
are carried out is much higher than the probability that a type I error will result from 
a single test. If three tests are independently carried out, each at significance level .05, 
then the probability that at least one type I error is committed is $1 - (.95)^3 = .1426$. 
Clearly as the number of tests increases, the probability of committing at least one type 
I error gets larger and in fact will approach 1. 
Suppose we want the probability of committing at least one type I error in two 
independent tests to be .05, that is, an experimentwise error rate of .05. Then the 
significance level $\alpha$ for each test must be smaller than .05: 


$$.05 = 1 - (1 - \alpha)^2 \Rightarrow 1 - \alpha = \sqrt{.95} = .975 \Rightarrow \alpha = .025$$


If the probability of committing at least one type I error in three independent tests is 
to be .05, the significance level for each one must be .017 (replace the square root 
by the cube root in the foregoing argument). As the number of tests increases, the 
significance level for each one must decrease to 0 in order to maintain an experi- 
mentwise error rate of .05. 
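
These calculations extend directly to any number of independent tests. The following short sketch (function names are ours) computes the experimentwise error rate for $k$ independent tests at a common level and, in the other direction, the per-test level needed to hit a target experimentwise rate.

```python
def familywise_error(alpha, k):
    """P(at least one type I error) for k independent tests, each at level alpha."""
    return 1 - (1 - alpha) ** k


def per_test_level(target, k):
    """Per-test level that yields experimentwise error rate `target`
    for k independent tests (take the kth root instead of the square root)."""
    return 1 - (1 - target) ** (1 / k)


for k in (1, 2, 3, 10, 71):
    print(k, round(familywise_error(0.05, k), 4), round(per_test_level(0.05, k), 4))
```

For $k = 2$ and $k = 3$ the printed values match the figures .0975, .1426, .025, and .017 given in the text; the larger values of $k$ show how quickly the experimentwise rate climbs when each test is run at level .05.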

Often it is not reasonable to assume that the various tests are independent of 
one another. In the example cited at the beginning of this subsection, four different 
tests were carried out based on the same sample involving one particular type of 
concrete in combination with a specified core diameter and length-to-diameter ratio. 
It is then no longer clear how the experimentwise error rate relates to the significance 
level for each individual test. Let $A_i$ denote the event that the $i$th test results in a type 
I error. Then in the case of $k$ tests, 


$$
\begin{aligned}
P(\text{at least one type I error}) &= P(A_1 \cup A_2 \cup \cdots \cup A_k) \\
&\leq P(A_1) + \cdots + P(A_k) = k\alpha
\end{aligned}
$$


(the inequality in the last line is called the Bonferroni inequality; it can be proved by 
induction on k). Thus a significance level of .05/k for each test will ensure that the 
experimentwise significance level is at most .05. 
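
As a rough numerical check of the Bonferroni bound, the sketch below compares $k\alpha$ with the experimentwise rate in two extreme situations for which that rate is known exactly: fully independent tests, and tests so strongly dependent that they always err together (a hypothetical limiting case introduced only for illustration). Actual dependent tests typically fall somewhere between the two.

```python
def bonferroni_level(target, k):
    """Per-test significance level target/k suggested by the Bonferroni inequality."""
    return target / k


target = 0.05
for k in (2, 4, 10, 71):
    a = bonferroni_level(target, k)
    independent = 1 - (1 - a) ** k   # experimentwise rate if the k tests are independent
    identical = a                    # rate if all k tests always err together
    # Both cases respect the bound P(at least one type I error) <= k * alpha
    assert identical <= independent <= k * a
    print(k, round(a, 5), round(independent, 5))
```

In both cases the experimentwise rate stays at or below .05, at the cost of a per-test level that shrinks in proportion to $1/k$.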

Again, the central idea here is that in order for the probability of at least one 
type I error among k tests to be small, the significance level for each individual test 
must be quite small. If the significance level for each individual test is .05, for even 
a moderate number of tests it is rather likely that at least one type I error will be 
committed. That is, with $\alpha = .05$ for each test, when each null hypothesis is actually 
true, it is rather likely that at least one of the tests will yield a statistically significant 
result. This is why one should view a statistically significant result with skepticism 
when many tests are carried out using one of the traditional significance levels. 


The Likelihood Ratio Principle 


The test procedures presented in this and subsequent chapters will (at least for the 
most part) be intuitively sensible. But there are many situations that arise in practice 
where intuition is not a reliable guide to obtaining a test statistic. We now describe 
a general strategy for this purpose. Let $x_1, x_2, \ldots, x_n$ be the observations in a random 
sample of size $n$ from a probability distribution $f(x; \theta)$. The joint distribution 
evaluated at these sample values is the product $f(x_1; \theta) \cdot f(x_2; \theta) \cdots f(x_n; \theta)$. As in the 
discussion of maximum likelihood estimation, the likelihood function is this joint 
distribution, regarded as a function of $\theta$. Consider testing $H_0$: $\theta$ is in $\Omega_0$ versus $H_a$: $\theta$ is 
in $\Omega_a$, where $\Omega_0$ and $\Omega_a$ are disjoint (for example, $H_0$: $\theta = 100$ versus $H_a$: $\theta > 100$). 
The likelihood ratio principle for test construction proceeds as follows: 


1. Find the largest value of the likelihood for any $\theta$ in $\Omega_0$ (by finding the 
maximum likelihood estimate within $\Omega_0$ and substituting back into the likelihood 
function). 

2. Find the largest value of the likelihood for any $\theta$ in $\Omega_a$. 

3. Form the ratio 

$$\lambda(x_1, \ldots, x_n) = \frac{\text{maximum likelihood for } \theta \text{ in } \Omega_0}{\text{maximum likelihood for } \theta \text{ in } \Omega_a}$$
The ratio $\lambda(x_1, \ldots, x_n)$ is called the likelihood ratio statistic value. Intuitively, the 
smaller the value of $\lambda$, the stronger is the evidence against $H_0$. It can, for example, 
be shown that for testing $H_0$: $\mu \leq \mu_0$ versus $H_a$: $\mu > \mu_0$ in the case of population 
normality, a small value of $\lambda$ is equivalent to a large value of $t$. Thus the one-sample 
t test comes from applying the likelihood ratio principle. We emphasize that once a 
test statistic has been selected, its distribution when $H_0$ is true is required for P-value 
determination; statistical theory must again come to the rescue! 
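
The connection between $\lambda$ and $t$ can be seen numerically. The sketch below assumes a normal population with both $\mu$ and $\sigma$ unknown, in which case the maximized likelihood is proportional to (sum of squared deviations)$^{-n/2}$; the data values are hypothetical and the function names are ours. When $\bar{x} > \mu_0$, the ratio reduces to $\lambda = [1 + t^2/(n-1)]^{-n/2}$, a decreasing function of $t$.

```python
from math import isclose, sqrt
from statistics import mean


def likelihood_ratio(x, mu0):
    """lambda for H0: mu <= mu0 versus Ha: mu > mu0 (normal data, sigma unknown)."""
    n = len(x)
    xbar = mean(x)
    mu_null = min(xbar, mu0)  # best mu within the null region mu <= mu0
    mu_alt = max(xbar, mu0)   # best (limiting) mu within the region mu > mu0
    ss_null = sum((xi - mu_null) ** 2 for xi in x)
    ss_alt = sum((xi - mu_alt) ** 2 for xi in x)
    # Maximized normal likelihood is proportional to (sum of squares)^(-n/2)
    return (ss_alt / ss_null) ** (n / 2)


def t_statistic(x, mu0):
    """One-sample t statistic for the null value mu0."""
    n = len(x)
    xbar = mean(x)
    s = sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
    return (xbar - mu0) / (s / sqrt(n))


# Hypothetical sample, for illustration only
x = [101.2, 99.8, 103.5, 100.9, 102.7, 98.6, 104.1, 101.8]
n = len(x)
for mu0 in (101, 100, 99):
    lam = likelihood_ratio(x, mu0)
    t = t_statistic(x, mu0)
    # The two expressions for lambda agree whenever xbar > mu0
    print(mu0, round(t, 3), lam, isclose(lam, (1 + t * t / (n - 1)) ** (-n / 2)))
```

As the null value moves farther below the sample mean, $t$ increases and $\lambda$ decreases, so rejecting for small $\lambda$ is the same as rejecting for large $t$.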

The likelihood ratio principle can also be applied when the $X_i$’s have different 
distributions and even when they are dependent, though the likelihood function can be 
complicated in such cases. Many of the test procedures to be presented in subsequent chapters 
are obtained from the likelihood ratio principle. These tests often turn out to minimize $\beta$ 
among all tests that have the desired $\alpha$, so are truly best tests. For more details and some 
worked examples, refer to one of the references listed in the Chapter 6 bibliography. 

A practical limitation is that, to construct the likelihood ratio test statistic, the 
form of the probability distribution from which the sample comes must be speci- 
fied. Derivation of the t test from the likelihood ratio principle requires assuming a 
normal pdf. If an investigator is willing to assume that the distribution is symmetric 
but does not want to be specific about its exact form (such as normal, uniform, or 
Cauchy), then the principle fails because there is no way to write a joint pdf simulta- 
neously valid for all symmetric distributions. In Chapter 15, we will present several 
distribution-free test procedures, so called because the probability of a type I error 
is controlled simultaneously for many different underlying distributions. These pro- 
cedures are useful when the investigator has limited knowledge of the underlying 
distribution. We shall also consider criteria for selection of a test procedure when 
several sensible candidates are available, and comment on the performance of sev- 
eral procedures when an underlying assumption such as normality is violated. 


EXERCISES Section 8.5 (53-56) 


53. Reconsider the paint-drying problem discussed in 
Example 8.5. The hypotheses were $H_0$: $\mu = 75$ versus 
$H_a$: $\mu < 75$, with $\sigma$ assumed to have value 9.0. Consider 
the alternative value $\mu = 74$, which in the context of the 
problem would presumably not be a practically significant 
departure from $H_0$. 

a. For a level .01 test, compute $\beta$ at this alternative for 
sample sizes $n = 100$, 900, and 2500. 

b. If the observed value of $\bar{X}$ is $\bar{x} = 74$, what can you 
say about the resulting P-value when $n = 2500$? Is 
the data statistically significant at any of the standard 
values of $\alpha$? 

c. Would you really want to use a sample size of 2500 
along with a level .01 test (disregarding the cost of 
such an experiment)? Explain. 


54. Consider the large-sample level .01 test in Section 8.4 for 
testing $H_0$: $p = .2$ against $H_a$: $p > .2$. 

a. For the alternative value $p = .21$, compute $\beta(.21)$ for 
sample sizes $n = 100$, 2500, 10,000, 40,000, and 
90,000. 

b. For $\hat{p} = x/n = .21$, compute the P-value when 
$n = 100$, 2500, 10,000, and 40,000. 

c. In most situations, would it be reasonable to use a 
level .01 test in conjunction with a sample size of 
40,000? Why or why not? 


55. Consider carrying out $m$ tests of hypotheses based on 
independent samples, each at significance level (exactly) .01. 

a. What is the probability of committing at least one 
type I error when $m = 5$? When $m = 10$? 

b. How many such tests would it take for the probability 
of committing at least one type I error to be at least .5? 


56. A 95% CI for true average amount of warpage (mm) of 
laminate sheets under specified conditions was calculated 
as (1.81, 1.95), based on a sample size of $n = 15$ and the 
assumption that amount of warpage is normally distributed. 

a. Suppose you want to test $H_0$: $\mu = 2$ versus $H_a$: $\mu \neq 2$ 
using $\alpha = .05$. What conclusion would be 
appropriate, and why? 

b. If you wanted to use a significance level of .01 for the 
test in (a), what conclusion would be appropriate? 
SUPPLEMENTARY EXERCISES (57-80) 


57. A sample of 50 lenses used in eyeglasses yields a sample 
mean thickness of 3.05 mm and a sample standard deviation 
of .34 mm. The desired true average thickness of 
such lenses is 3.20 mm. Does the data strongly suggest 
that the true average thickness of such lenses is something 
other than what is desired? Test using $\alpha = .05$. 

58. In Exercise 57, suppose the experimenter had believed 
before collecting the data that the value of $\sigma$ was 
approximately .30. If the experimenter wished the probability of 
a type II error to be .05 when $\mu = 3.00$, was a sample 
size 50 unnecessarily large? 

59. It is specified that a certain type of iron should contain 
.85 g of silicon per 100 g of iron (.85%). The silicon 
content of each of 25 randomly selected iron specimens 
was determined, and the accompanying Minitab output 
resulted from a test of the appropriate hypotheses. 


Variable    N    Mean     StDev    SE Mean    T      P 
sil cont    25   0.8880   0.1807   0.0361     1.05   0.30 


a. What hypotheses were tested? 

b. What conclusion would be reached for a significance 
level of .05, and why? Answer the same question for 
a significance level of .10. 

60. One method for straightening wire before coiling it to 
make a spring is called “roller straightening.” The article 
“The Effect of Roller and Spinner Wire Straightening 
on Coiling Performance and Wire Properties” 
(Springs, 1987: 27-28) reports on the tensile properties 
of wire. Suppose a sample of 16 wires is selected and 
each is tested to determine tensile strength (N/mm$^2$). The 
resulting sample mean and standard deviation are 2160 
and 30, respectively. 

a. The mean tensile strength for springs made using 
spinner straightening is 2150 N/mm$^2$. What hypotheses 
should be tested to determine whether the mean 
tensile strength for the roller method exceeds 2150? 

b. Assuming that the tensile strength distribution is 
approximately normal, what test statistic would you 
use to test the hypotheses in part (a)? 

c. What is the value of the test statistic for this data? 

d. What is the P-value for the value of the test statistic 
computed in part (c)? 

e. For a level .05 test, what conclusion would you reach? 

61. Contamination of mine soils in China is a serious 
environmental problem. The article “Heavy Metal 
Contamination in Soils and Phytoaccumulation in a 
Manganese Mine Wasteland, South China” (Air, 
Soil, and Water Res., 2008: 31-41) reported that, for a 
sample of 3 soil specimens from a certain restored mining 
area, the sample mean concentration of Total Cu 
was 45.31 mg/kg with a corresponding (estimated) 
standard error of the mean of 5.26. It was also stated 
that the China background value for this concentration 
was 20. The results of various statistical tests described 
in the article were predicated on assuming normality. 

a. Does the data provide strong evidence for concluding 
that the true average concentration in the sampled 
region exceeds the stated background value? 
Carry out a test at significance level .01. Does the 
result surprise you? Explain. 

b. Referring back to the test of (a), how likely is it that 
the P-value would be at least .01 when the true average 
concentration is 50 and the true standard deviation 
of concentration is 10? 

62. The article “Orchard Floor Management Utilizing 
Soil-Applied Coal Dust for Frost Protection” (Agri. 
and Forest Meteorology, 1988: 71-82) reports the 
following values for soil heat flux of eight plots covered 
with coal dust. 


34.7 35.4 34.7 37.7 32.5 28.0 18.4 24.9 


The mean soil heat flux for plots covered only with 
grass is 29.0. Assuming that the heat-flux distribution is 
approximately normal, does the data suggest that the coal 
dust is effective in increasing the mean heat flux over that 
for grass? Test the appropriate hypotheses using $\alpha = .05$. 

63. The article “Caffeine Knowledge, Attitudes, and 
Consumption in Adult Women” (J. of Nutrition Educ., 
1992: 179-184) reports the following summary data on 
daily caffeine consumption for a sample of adult women: 
$n = 47$, $\bar{x} = 215$ mg, $s = 235$ mg, and range = 5-1176. 

a. Does it appear plausible that the population distribution 
of daily caffeine consumption is normal? Is 
it necessary to assume a normal population distribution 
to test hypotheses about the value of the population 
mean consumption? Explain your reasoning. 

b. Suppose it had previously been believed that mean 
consumption was at most 200 mg. Does the given 
data contradict this prior belief? Test the appropriate 
hypotheses at significance level .10. 

64. Annual holdings turnover for a mutual fund is the 
percentage of a fund’s assets that are sold during a particular 
year. Generally speaking, a fund with a low value of 
turnover is more stable and risk averse, whereas a high 
value of turnover indicates a substantial amount of buying 
and selling in an attempt to take advantage of short-term 
market fluctuations. Here are values of turnover for a 
sample of 20 large-cap blended funds (refer to Exercise 
1.53 for a bit more information) extracted from 
Morningstar.com: 

1.03 1.23 1.10 1.64 1.30 1.27 1.25 0.78 1.05 0.64 
0.94 2.86 1.05 0.75 0.09 0.79 1.61 1.26 0.93 0.84 
a. Would you use the one-sample t test to decide 
whether there is compelling evidence for concluding 
that the population mean turnover is less than 100%? 
Explain. 

b. A normal probability plot of the 20 ln(turnover) values 
shows a very pronounced linear pattern, suggesting 
it is reasonable to assume that the turnover 
distribution is lognormal. Recall that $X$ has a lognormal 
distribution if $\ln(X)$ is normally distributed with 
mean value $\mu$ and variance $\sigma^2$. Because $\mu$ is also the 
median of the $\ln(X)$ distribution, $e^{\mu}$ is the median of 
the $X$ distribution. Use this information to decide 
whether there is compelling evidence for concluding 
that the median of the turnover population distribution 
is less than 100%. 


65. The true average breaking strength of ceramic insulators 
of a certain type is supposed to be at least 10 psi. They 
will be used for a particular application unless sample 
data indicates conclusively that this specification has not 
been met. A test of hypotheses using $\alpha = .01$ is to be 
based on a random sample of ten insulators. Assume that 
the breaking-strength distribution is normal with unknown 
standard deviation. 

a. If the true standard deviation is .80, how likely is it 
that insulators will be judged satisfactory when true 
average breaking strength is actually only 9.5? 
Only 9.0? 

b. What sample size would be necessary to have a 75% 
chance of detecting that the true average breaking 
strength is 9.5 when the true standard deviation is .80? 


66. The accompanying observations on residual flame time 
(sec) for strips of treated children’s nightwear were 
given in the article “An Introduction to Some Precision 
and Accuracy of Measurement Problems” (J. of 
Testing and Eval., 1982: 132-140). Suppose a true 
average flame time of at most 9.75 had been mandated. 
Does the data suggest that this condition has not been 
met? Carry out an appropriate test after first investigating 
the plausibility of assumptions that underlie your 
method of inference. 


9.85 9.93 9.75 9.77 9.67 9.87 9.67 
9.94 9.85 9.75 9.83 9.92 9.74 9.99 
9.88 9.95 9.95 9.93 9.92 9.89 


67. The incidence of a certain type of chromosome defect in 
the U.S. adult male population is believed to be 1 in 75. 
A random sample of 800 individuals in U.S. penal 
institutions reveals 16 who have such defects. Can it be 
concluded that the incidence rate of this defect among 
prisoners differs from the presumed rate for the entire adult 
male population? 

a. State and test the relevant hypotheses using $\alpha = .05$. 
What type of error might you have made in reaching 
a conclusion? 

b. Based on the P-value calculated in (a), could $H_0$ be 
rejected at significance level .20? 
68. In an investigation of the toxin produced by a certain 
poisonous snake, a researcher prepared 26 different vials, 
each containing 1 g of the toxin, and then determined the 
amount of antitoxin needed to neutralize the toxin. The 
sample average amount of antitoxin necessary was found 
to be 1.89 mg, and the sample standard deviation was .42. 
Previous research had indicated that the true average 
neutralizing amount was 1.75 mg/g of toxin. Does the 
new data contradict the value suggested by prior 
research? Test the relevant hypotheses. Does the validity 
of your analysis depend on any assumptions about the 
population distribution of neutralizing amount? Explain. 


69. The sample average unrestrained compressive strength 
for 45 specimens of a particular type of brick was 
computed to be 3107 psi, and the sample standard deviation 
was 188. The distribution of unrestrained compressive 
strength may be somewhat skewed. Does the data strongly 
indicate that the true average unrestrained compressive 
strength is less than the design value of 3200? Test 
using $\alpha = .001$. 


70. The Dec. 30, 2009, New York Times reported that in a 
survey of 948 American adults who said they were at least 
somewhat interested in college football, 597 said the 
current Bowl Championship System should be replaced by a 
playoff similar to that used in college basketball. Does this 
provide compelling evidence for concluding that a majority 
of all such individuals favor replacing the B.C.S. with a 
playoff? Test the appropriate hypotheses using a 
significance level of .001. 


71. When $X_1, X_2, \ldots, X_n$ are independent Poisson variables, 
each with parameter $\mu$, and $n$ is large, the sample mean 
$\bar{X}$ has approximately a normal distribution with $\mu = E(\bar{X})$ 
and $V(\bar{X}) = \mu/n$. This implies that 

$$Z = \frac{\bar{X} - \mu}{\sqrt{\mu/n}}$$

has approximately a standard normal distribution. For 
testing $H_0$: $\mu = \mu_0$, we can replace $\mu$ by $\mu_0$ in the 
equation for $Z$ to obtain a test statistic. This statistic is actually 
preferred to the large-sample statistic with denominator 
$S/\sqrt{n}$ (when the $X_i$’s are Poisson) because it is tailored 
explicitly to the Poisson assumption. If the number of 
requests for consulting received by a certain statistician 
during a 5-day work week has a Poisson distribution and 
the total number of consulting requests during a 36-week 
period is 160, does this suggest that the true average 
number of weekly requests exceeds 4.0? Test using $\alpha = .02$. 


72. An article in the Nov. 11, 2005, issue of the San Luis 
Obispo Tribune reported that researchers making random 
purchases at California Wal-Mart stores found scanners 
coming up with the wrong price 8.3% of the time. 
Suppose this was based on 200 purchases. The National 
Institute for Standards and Technology says that in the 
long run at most two out of every 100 items should have 
incorrectly scanned prices. 


73. 


74, 


75. 


76. 


a. Develop a test procedure with a significance level of 
(approximately) .05, and then carry out the test to 
decide whether the NIST benchmark is not satisfied. 

b. For the test procedure you employed in (a), what is 
the probability of deciding that the NIST benchmark 
has been satisfied when in fact the mistake rate is 5%? 


The article “Heavy Drinking and Polydrug Use Among 
College Students” (J. of Drug Issues, 2008: 445-466) 
stated that 51 of the 462 college students in a sample had 
a lifetime abstinence from alcohol. Does this provide 
strong evidence for concluding that more than 10% of the 
population sampled had completely abstained from alco- 
hol use? Test the appropriate hypotheses. [Note: The arti- 
cle used more advanced statistical methods to study the 
use of various drugs among students characterized as 
light, moderate, and heavy drinkers.] 


The article “Analysis of Reserve and Regular Bottlings: 
Why Pay for a Difference Only the Critics Claim to 
Notice?” (Chance, Summer 2005, pp. 9-15) reported on 
an experiment to investigate whether wine tasters could 
distinguish between more expensive reserve wines and 
their regular counterparts. Wine was presented to tasters 
in four containers labeled A, B, C, and D, with two of 
these containing the reserve wine and the other two the 
regular wine. Each taster randomly selected three of the 
containers, tasted the selected wines, and indicated which 
of the three he/she believed was different from the other 
two. Of the n = 855 tasting trials, 346 resulted in correct 
distinctions (either the one reserve that differed from the 
two regular wines or the one regular wine that differed 
from the two reserves). Does this provide compelling 
evidence for concluding that tasters of this type have some 
ability to distinguish between reserve and regular wines? 
State and test the relevant hypotheses. Are you particu- 
larly impressed with the ability of tasters to distinguish 
between the two types of wine? 


The American Academy of Pediatrics recommends a 
vitamin D level of at least 20 ng/ml for infants. The 
article “Vitamin D and Parathormone Levels of Late- 
Preterm Formula Fed Infants During the First Year 
of Life” (European J. of Clinical Nutr., 2012: 224-230) 
reported that for a sample of 102 preterm infants judged 
to be of appropriate weight for their gestational age, the 
sample mean vitamin D level at 2 weeks was 21 with a 
sample standard deviation of 11. Does this provide con- 
vincing evidence that the population mean vitamin D 
level for such infants exceeds 20? Test the relevant 
hypotheses using a significance level of .10. 


Chapter 7 presented a CI for the variance $\sigma^2$ of a normal
population distribution. The key result there was that the
rv $\chi^2 = (n - 1)S^2/\sigma^2$ has a chi-squared distribution with
$n - 1$ df. Consider the null hypothesis $H_0: \sigma^2 = \sigma_0^2$
(equivalently, $\sigma = \sigma_0$). Then when $H_0$ is true, the test
statistic $\chi^2 = (n - 1)S^2/\sigma_0^2$ has a chi-squared distribution
with $n - 1$ df. If the relevant alternative is $H_a: \sigma^2 > \sigma_0^2$,


77. 


78. 


79. 
the P-value is the area under the $\chi^2$ curve with $n - 1$ df
to the right of the calculated $\chi^2$ value. To ensure reasonably
uniform characteristics for a particular application,
it is desired that the true standard deviation of the softening
point of a certain type of petroleum pitch be at most
.50°C. The softening points of ten different specimens
were determined, yielding a sample standard deviation of
.58°C. Does this strongly contradict the uniformity
specification? Test the appropriate hypotheses using
$\alpha = .01$. [Hint: Consult Table A.11.]


Referring to Exercise 76, suppose an investigator wishes
to test $H_0: \sigma^2 = .04$ versus $H_a: \sigma^2 < .04$ based on a
sample of 21 observations. The computed value of
$20s^2/.04$ is 8.58. Place bounds on the P-value and then
reach a conclusion at level .01. [Hint: Consult Table A.7.]


When the population distribution is normal and $n$ is
large, the sample standard deviation $S$ has approximately
a normal distribution with $E(S) \approx \sigma$ and $V(S) \approx \sigma^2/(2n)$.
We already know that in this case, for any $n$, $\bar{X}$ is normal
with $E(\bar{X}) = \mu$ and $V(\bar{X}) = \sigma^2/n$.

a. Assuming that the underlying distribution is normal,
what is an approximately unbiased estimator of the
99th percentile $\theta = \mu + 2.33\sigma$?

b. When the $X_i$'s are normal, it can be shown that $\bar{X}$ and
$S$ are independent rv's (one measures location
whereas the other measures spread). Use this to compute
$V(\hat{\theta})$ and $\sigma_{\hat{\theta}}$ for the estimator $\hat{\theta}$ of part (a). What
is the estimated standard error $\hat{\sigma}_{\hat{\theta}}$?

c. Write a test statistic for testing $H_0: \theta = \theta_0$ that has
approximately a standard normal distribution when
$H_0$ is true. If soil pH is normally distributed in a
certain region and 64 soil samples yield
$\bar{x} = 6.33$, $s = .16$, does this provide strong evidence
for concluding that at most 99% of all possible
samples would have a pH of less than 6.75? Test
using $\alpha = .01$.


Let $X_1, X_2, \ldots, X_n$ be a random sample from an exponential
distribution with parameter $\lambda$. Then it can be shown that
$2\lambda\sum X_i$ has a chi-squared distribution with $\nu = 2n$ (by first
showing that $2\lambda X_i$ has a chi-squared distribution with
$\nu = 2$).

a. Use this fact to obtain a test statistic for testing
$H_0: \mu = \mu_0$. Then explain how you would determine
the P-value when the alternative hypothesis is
$H_a: \mu < \mu_0$. [Hint: $E(X_i) = \mu = 1/\lambda$, so $\mu = \mu_0$ is
equivalent to $\lambda = 1/\mu_0$.]

b. Suppose that ten identical components, each having
exponentially distributed time until failure, are
tested. The resulting failure times are

95 16 11 3 42 71 225 64 87 123

Use the test procedure of part (a) to decide whether
the data strongly suggests that the true average lifetime
is less than the previously claimed value of 75.
[Hint: Consult Table A.7.]
80. Because of variability in the manufacturing process, the
actual yielding point of a sample of mild steel subjected
to increasing stress will usually differ from the theoretical
yielding point. Let $p$ denote the true proportion of
samples that yield before their theoretical yielding point.
If on the basis of a sample it can be concluded that more
than 20% of all specimens yield before the theoretical
point, the production process will have to be modified.

a. If 15 of 60 specimens yield before the theoretical
point, what is the P-value when the appropriate test is
used, and what would you advise the company to do?

b. If the true percentage of “early yields” is actually
50% (so that the theoretical point is the median of the
yield distribution) and a level .01 test is used, what is
the probability that the company concludes a modification
of the process is necessary?


BIBLIOGRAPHY 


See the bibliographies at the ends of Chapter 6 and Chapter 7. 
Inferences Based on Two Samples


INTRODUCTION 


Chapters 7 and 8 presented confidence intervals (CI’s) and hypothesis-testing
procedures for a single mean $\mu$, single proportion $p$, and a single variance $\sigma^2$.
Here we extend these methods to situations involving the means, proportions,
and variances of two different population distributions. For example, let $\mu_1$
denote true average Rockwell hardness for heat-treated steel specimens and
$\mu_2$ denote true average hardness for cold-rolled specimens. Then an investigator
might wish to use samples of hardness observations from each type of
steel as a basis for calculating an interval estimate of $\mu_1 - \mu_2$, the difference
between the two true average hardnesses. As another example, let $p_1$ denote
the true proportion of nickel-cadmium cells produced under current operating
conditions that are defective because of internal shorts, and let $p_2$ represent the
true proportion of cells with internal shorts produced under modified operating
conditions. If the rationale for the modified conditions is to reduce the proportion
of defective cells, a quality engineer would want to use sample information
to test the null hypothesis $H_0: p_1 - p_2 = 0$ (i.e., $p_1 = p_2$) versus the alternative
hypothesis $H_a: p_1 - p_2 > 0$ (i.e., $p_1 > p_2$).

Section 9.1 presents z intervals and tests for making inferences about a difference
between two population means (i.e., procedures developed by starting with
a standardized variable that has at least approximately a standard normal distribution).
Two-sample t procedures for making inferences about $\mu_1 - \mu_2$ are the focus
of Section 9.2. The validity of methods described in the first two sections depends
on selecting samples from the two populations independently of one another. Often
in practice data is gathered in pairs. For example, a sample of individuals might be
selected, a measurement of some sort made before a treatment is applied, and then
another measurement subsequent to application of the treatment. The analysis of
such paired data is described in Section 9.3. Section 9.4 considers inferences about
a difference between two population proportions, and Section 9.5 does the same
thing for a ratio of population variances or standard deviations.


9.1 z Tests and Confidence Intervals for a Difference Between Two Population Means


The inferences discussed in this section concern a difference $\mu_1 - \mu_2$ between the
means of two different population distributions. An investigator might, for example,
wish to test hypotheses about the difference between true average breaking strengths
of two different types of corrugated fiberboard. One such hypothesis would state that
$\mu_1 - \mu_2 = 0$, that is, that $\mu_1 = \mu_2$. Alternatively, it may be appropriate to estimate
$\mu_1 - \mu_2$ by computing a 95% CI. Such inferences necessitate obtaining a sample of
strength observations for each type of fiberboard.


Basic Assumptions 


1. $X_1, X_2, \ldots, X_m$ is a random sample from a distribution with mean $\mu_1$ and
variance $\sigma_1^2$.

2. $Y_1, Y_2, \ldots, Y_n$ is a random sample from a distribution with mean $\mu_2$ and
variance $\sigma_2^2$.

3. The X and Y samples are independent of one another.


The use of m for the number of observations in the first sample and n for the number 
of observations in the second sample allows for the two sample sizes to be different. 
Sometimes this is because it is more difficult or expensive to sample one popula- 
tion than another. In other situations, equal sample sizes may initially be specified, 
but for reasons beyond the scope of the experiment, the actual sample sizes may 
differ. For example, the abstract of the article “A Randomized Controlled Trial
Assessing the Effectiveness of Professional Oral Care by Dental Hygienists”
(Intl. J. of Dental Hygiene, 2008: 63-67) states that “Forty patients were randomly
assigned to either the POC group ($m = 20$) or the control group ($n = 20$).
One patient in the POC group and three in the control group dropped out because
of exacerbation of underlying disease or death.” The data analysis was then based
on $m = 19$ and $n = 16$.

The natural estimator of $\mu_1 - \mu_2$ is $\bar{X} - \bar{Y}$, the difference between the corresponding
sample means. Inferential procedures are based on standardizing this estimator,
so we need expressions for the expected value and standard deviation of $\bar{X} - \bar{Y}$.


PROPOSITION The expected value of $\bar{X} - \bar{Y}$ is $\mu_1 - \mu_2$, so $\bar{X} - \bar{Y}$ is an unbiased estimator of
$\mu_1 - \mu_2$. The standard deviation of $\bar{X} - \bar{Y}$ is

$$\sigma_{\bar{X}-\bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$

Proof Both these results depend on the rules of expected value and variance
presented in Chapter 5. Since the expected value of a difference is the difference of
expected values,

$$E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2$$

Because the X and Y samples are independent, $\bar{X}$ and $\bar{Y}$ are independent quantities.
Then the variance of the difference is the sum of $V(\bar{X})$ and $V(\bar{Y})$:

$$V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y}) = \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}$$

The standard deviation of $\bar{X} - \bar{Y}$ is the square root of this expression.


If we regard $\mu_1 - \mu_2$ as a parameter $\theta$, then its estimator is $\hat{\theta} = \bar{X} - \bar{Y}$ with
standard deviation $\sigma_{\hat{\theta}}$ given by the proposition. When $\sigma_1^2$ and $\sigma_2^2$ both have known
values, the value of this standard deviation can be calculated. The sample variances
must be used to estimate $\sigma_{\hat{\theta}}$ when $\sigma_1^2$ and $\sigma_2^2$ are unknown.
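
The proposition can be checked numerically with a few lines of code. The following is a minimal Python sketch, not part of the original text: the function name `mean_and_sd_of_difference` and all input values are hypothetical and chosen only for illustration.

```python
import math


def mean_and_sd_of_difference(mu1, sigma1, m, mu2, sigma2, n):
    """Return (expected value, standard deviation) of Xbar - Ybar.

    Assumes two independent random samples of sizes m and n, as in the
    basic assumptions of this section.
    """
    expected = mu1 - mu2                          # E(Xbar - Ybar)
    variance = sigma1**2 / m + sigma2**2 / n      # V(Xbar) + V(Ybar)
    return expected, math.sqrt(variance)


# Hypothetical population values, used only to exercise the formulas.
expected, sd = mean_and_sd_of_difference(30.0, 4.0, 20, 35.0, 5.0, 25)
print(f"E = {expected:.1f}, SD = {sd:.4f}")
```

Note that the sample sizes affect only the standard deviation; the expected value depends on the population means alone.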


Test Procedures for Normal Populations with Known Variances


In Chapters 7 and 8, the first CI and test procedure for a population mean $\mu$ were
based on the assumption that the population distribution was normal with the value
of the population variance $\sigma^2$ known to the investigator. Similarly, we first assume
here that both population distributions are normal and that the values of both $\sigma_1^2$ and
$\sigma_2^2$ are known. Situations in which one or both of these assumptions can be dispensed
with will be presented shortly.

Because the population distributions are normal, both $\bar{X}$ and $\bar{Y}$ have normal
distributions. Furthermore, independence of the two samples implies that the
two sample means are independent of one another. Thus the difference $\bar{X} - \bar{Y}$ is
normally distributed, with expected value $\mu_1 - \mu_2$ and standard deviation $\sigma_{\bar{X}-\bar{Y}}$
given in the foregoing proposition. Standardizing $\bar{X} - \bar{Y}$ gives the standard normal
variable

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} \tag{9.1}$$


In a hypothesis-testing problem, the null hypothesis will state that $\mu_1 - \mu_2$
has a specified value. Denoting this null value by $\Delta_0$, we have $H_0: \mu_1 - \mu_2 = \Delta_0$.
Often $\Delta_0 = 0$, in which case $H_0$ says that $\mu_1 = \mu_2$. If $\mu_1$ represents the true average
fuel efficiency (mpg) for automobiles of a certain type equipped with a six-cylinder
engine and $\mu_2$ denotes true average efficiency for automobiles of the same type
equipped with a four-cylinder engine, a sensible null hypothesis of interest might be
$H_0: \mu_1 - \mu_2 = -3$. This is a fancy way of saying that on average the fuel efficiency
for four-cylinder engines is 3 mpg higher than it is for six-cylinder engines.

Consider the alternative hypothesis $H_a: \mu_1 - \mu_2 > \Delta_0$. A value $\bar{x} - \bar{y}$ that
considerably exceeds $\Delta_0$ (the expected value of $\bar{X} - \bar{Y}$ when $H_0$ is true) provides
evidence against $H_0$ and for $H_a$. Such a value of $\bar{x} - \bar{y}$ corresponds to a positive and
large value of the test statistic. This implies that if the calculated sample means and
sample sizes are substituted into the formula for $Z$ and the resulting value is $z$, then
values more contradictory to $H_0$ than $z$ itself are those larger than $z$. Thus

P-value = P(obtaining a test statistic value at least as contradictory to $H_0$ as $z$ when $H_0$ is true)

= P(a standard normal rv is $\geq z$)

= the area under the standard normal curve to the right of $z$

= $1 - \Phi(z)$

The test procedure in this case is upper-tailed because the P-value is an upper-tail
z curve area.

When the alternative hypothesis contains the inequality $<$, test statistic values
more contradictory to $H_0$ than $z$ itself are those smaller than $z$. The P-value is then the
area under the standard normal curve to the left of $z$; the test is lower-tailed. Lastly, if
the inequality $\neq$ appears in $H_a$, then values either larger than $|z|$ or smaller than $-|z|$ are
more contradictory to $H_0$ than $z$ itself (the absolute value around $z$ takes care of both the
$z$ positive case and the $z$ negative case). The implication is that the P-value is the sum
of the area under the standard normal curve to the left of $-|z|$ and the area to the right
of $|z|$; that is, a two-tailed test. This sum of two tail areas is the same as doubling the
captured tail area.


Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$

Test statistic value: $z = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$

Alternative Hypothesis                  P-Value Determination
$H_a: \mu_1 - \mu_2 > \Delta_0$         Area under the standard normal curve to the right of $z$
$H_a: \mu_1 - \mu_2 < \Delta_0$         Area under the standard normal curve to the left of $z$
$H_a: \mu_1 - \mu_2 \neq \Delta_0$      $2 \cdot$ (Area under the standard normal curve to the right of $|z|$)

Assumptions: Two normal population distributions with known values of $\sigma_1$
and $\sigma_2$, two independent random samples.


EXAMPLE 9.1 Analysis of a random sample consisting of $m = 20$ specimens of cold-rolled steel to
determine yield strengths resulted in a sample average strength of $\bar{x} = 29.8$ ksi. A
second random sample of $n = 25$ two-sided galvanized steel specimens gave a
sample average strength of $\bar{y} = 34.7$ ksi. Assuming that the two yield-strength
distributions are normal with $\sigma_1 = 4.0$ and $\sigma_2 = 5.0$ (suggested by a graph in the article
“Zinc-Coated Sheet Steel: An Overview,” Automotive Engr., Dec. 1984: 39-43),
does the data indicate that the corresponding true average yield strengths $\mu_1$ and $\mu_2$
are different? Let’s carry out a test at significance level $\alpha = .01$.


1. The parameter of interest is $\mu_1 - \mu_2$, the difference between the true average
strengths for the two types of steel.

2. The null hypothesis is $H_0: \mu_1 - \mu_2 = 0$.

3. The alternative hypothesis is $H_a: \mu_1 - \mu_2 \neq 0$; if $H_a$ is true, then $\mu_1$ and $\mu_2$ are
different.

4. With $\Delta_0 = 0$, the test statistic value is

$$z = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$

5. Substituting $m = 20$, $\bar{x} = 29.8$, $\sigma_1^2 = 16.0$, $n = 25$, $\bar{y} = 34.7$, and $\sigma_2^2 = 25.0$ into
the formula for $z$ yields

$$z = \frac{29.8 - 34.7}{\sqrt{\dfrac{16.0}{20} + \dfrac{25.0}{25}}} = \frac{-4.90}{1.34} = -3.66$$

That is, the observed value of $\bar{x} - \bar{y}$ is more than 3 standard deviations below
what would be expected were $H_0$ true.

6. The $\neq$ inequality in $H_a$ implies that a two-tailed test is appropriate. The P-value is
$2[1 - \Phi(3.66)] \approx 2(0) = 0$ (software gives .00025).

7. Since P-value $\approx 0 \leq .01 = \alpha$, $H_0$ is therefore rejected at level .01 in favor of the
conclusion that $\mu_1 \neq \mu_2$. In fact, with a P-value this small, the null hypothesis
would be rejected at any sensible significance level. The sample data strongly
suggests that the true average yield strength for cold-rolled steel differs from that
for galvanized steel.
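
The calculation in Example 9.1 can be reproduced in software. The sketch below is an illustration rather than part of the original text: the function name `two_sample_z_test` and its argument names are hypothetical, and the standard normal cdf $\Phi$ is taken from Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist


def two_sample_z_test(xbar, ybar, sigma1, sigma2, m, n,
                      delta0=0.0, alternative="two-sided"):
    """z test for mu1 - mu2 with known population standard deviations.

    Returns (z, P-value). `alternative` is "greater", "less", or "two-sided",
    matching the three rows of the P-value table in this section.
    """
    se = (sigma1**2 / m + sigma2**2 / n) ** 0.5   # sd of Xbar - Ybar
    z = (xbar - ybar - delta0) / se
    phi = NormalDist().cdf                        # standard normal cdf
    if alternative == "greater":
        p_value = 1 - phi(z)                      # upper-tail area
    elif alternative == "less":
        p_value = phi(z)                          # lower-tail area
    elif alternative == "two-sided":
        p_value = 2 * (1 - phi(abs(z)))           # double the captured tail
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Summary values from Example 9.1 (cold-rolled vs. galvanized steel).
z, p_value = two_sample_z_test(29.8, 34.7, 4.0, 5.0, 20, 25)
print(f"z = {z:.2f}, P-value = {p_value:.5f}")
```

Because the example rounds the denominator to 1.34 before dividing, the unrounded z value computed here differs from $-3.66$ slightly in the second decimal place; the conclusion is unaffected.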


Using a Comparison to Identify Causality 


Investigators are often interested in comparing either the effects of two different treat- 
ments on a response or the response after treatment with the response after no treat- 
ment (treatment vs. control). If the individuals or objects to be used in the comparison 
are not assigned by the investigators to the two different conditions, the study is said 
to be observational. The difficulty with drawing conclusions based on an observa- 
tional study is that although statistical analysis may indicate a significant difference 
in response between the two groups, the difference may be due to some underlying 
factors that had not been controlled rather than to any difference in treatments. 


EXAMPLE 9.2 A letter in the Journal of the American Medical Association (May 19, 1978) reported 
that of 215 male physicians who were Harvard graduates and died between November 
1974 and October 1977, the 125 in full-time practice lived an average of 48.9 years 
beyond graduation, whereas the 90 with academic affiliations lived an average of 
43.2 years beyond graduation. Does the data suggest that the mean lifetime after 
graduation for doctors in full-time practice exceeds the mean lifetime for those who 
have an academic affiliation? (If so, those medical students who say that they are 
“dying to obtain an academic affiliation” may be closer to the truth than they realize; 
in other words, is “publish or perish” really “publish and perish’’?) 

Let $\mu_1$ denote the true average number of years lived beyond graduation for
physicians in full-time practice, and let $\mu_2$ denote the same quantity for physicians
with academic affiliations. Assume the 125 and 90 physicians to be random samples
from populations 1 and 2, respectively (which may not be sensible if there is reason
to believe that Harvard graduates have special characteristics that differentiate them
from all other physicians; in this case inferences would be restricted just to the
“Harvard populations”). The letter from which the data was taken gave no information
about variances, so for illustration assume that $\sigma_1 = 14.6$ and $\sigma_2 = 14.4$. The
hypotheses are $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$, so $\Delta_0$ is zero. The computed
value of the test statistic is

$$z = \frac{48.9 - 43.2}{\sqrt{\dfrac{(14.6)^2}{125} + \dfrac{(14.4)^2}{90}}} = \frac{5.70}{\sqrt{1.70 + 2.30}} = 2.85$$

The P-value for an upper-tailed test is $1 - \Phi(2.85) = .0022$. At significance level
.01, $H_0$ is rejected (because $\alpha >$ P-value) in favor of the conclusion that
$\mu_1 - \mu_2 > 0$ ($\mu_1 > \mu_2$). This is consistent with the information reported in the letter.

This data resulted from a retrospective observational study; the investigator did 
not start out by selecting a sample of doctors and assigning some to the “academic 
affiliation” treatment and the others to the “full-time practice” treatment, but instead 
identified members of the two groups by looking backward in time (through obituar- 
ies!) to past records. Can the statistically significant result here really be attributed to 
a difference in the type of medical practice after graduation, or is there some other 
underlying factor (e.g., age at graduation, exercise regimens, etc.) that might also fur- 
nish a plausible explanation for the difference? Observational studies have been used 
to argue for a causal link between smoking and lung cancer. There are many studies 
that show that the incidence of lung cancer is significantly higher among smokers than 
among nonsmokers. However, individuals had decided whether to become smokers 
long before investigators arrived on the scene, and factors in making this decision may 
have played a causal role in the contraction of lung cancer.


A randomized controlled experiment results when investigators assign 
subjects to the two treatments in a random fashion. When statistical significance is 
observed in such an experiment, the investigator and other interested parties will have 
more confidence in the conclusion that the difference in response has been caused by 
a difference in treatments. A very famous example of this type of experiment and con- 
clusion is the Salk polio vaccine experiment described in Section 9.4. Various aspects 
of experimental and sampling design are discussed at greater length in the (nonmath- 
ematical) books by Moore and by Freedman et al., listed in the Chapter 1 references. 


$\beta$ and the Choice of Sample Size


The probability of a type II error is easily calculated when both population distributions
are normal with known values of $\sigma_1$ and $\sigma_2$. Consider the case in which the alternative
hypothesis is $H_a: \mu_1 - \mu_2 > \Delta_0$. Let $\Delta'$ denote a value of $\mu_1 - \mu_2$ that exceeds $\Delta_0$
(a value for which $H_0$ is false). As with the upper-tailed z tests of Chapter 8, the inequality
P-value $\leq \alpha$ is equivalent to $z \geq z_\alpha$ (the area captured in the upper tail of the z curve
will be at most $\alpha$ if and only if the calculated $z$ is on or to the right of the z critical value
that captures area $\alpha$). This in turn is equivalent to $\bar{x} - \bar{y} \geq \Delta_0 + z_\alpha\sigma_{\bar{X}-\bar{Y}}$. Thus

$$\beta(\Delta') = P(\text{not rejecting } H_0 \text{ when } \mu_1 - \mu_2 = \Delta')$$

$$= P(\bar{X} - \bar{Y} < \Delta_0 + z_\alpha\sigma_{\bar{X}-\bar{Y}} \text{ when } \mu_1 - \mu_2 = \Delta')$$

When $\mu_1 - \mu_2 = \Delta'$, $\bar{X} - \bar{Y}$ is normally distributed with mean value $\Delta'$ and standard
deviation $\sigma_{\bar{X}-\bar{Y}}$ (the same standard deviation as when $H_0$ is true); using these
values to standardize the inequality in parentheses gives the desired probability.

Alternative Hypothesis                  $\beta(\Delta')$ = P(type II error when $\mu_1 - \mu_2 = \Delta'$)

$H_a: \mu_1 - \mu_2 > \Delta_0$         $\Phi\left(z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

$H_a: \mu_1 - \mu_2 < \Delta_0$         $1 - \Phi\left(-z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

$H_a: \mu_1 - \mu_2 \neq \Delta_0$      $\Phi\left(z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

where $\sigma = \sigma_{\bar{X}-\bar{Y}} = \sqrt{(\sigma_1^2/m) + (\sigma_2^2/n)}$


EXAMPLE 9.3 (Example 9.1 continued) Suppose that when $\mu_1$ and $\mu_2$ (the true average yield
strengths for the two types of steel) differ by as much as 5, the probability of detecting
such a departure from $H_0$ (the power of the test) should be .90. Does a level .01 test
with sample sizes $m = 20$ and $n = 25$ satisfy this condition? The value of $\sigma$ for these
sample sizes (the denominator of $z$) was previously calculated as 1.34. The probability
of a type II error for the two-tailed level .01 test when $\mu_1 - \mu_2 = \Delta' = 5$ is

$$\beta(5) = \Phi\left(2.58 - \frac{5 - 0}{1.34}\right) - \Phi\left(-2.58 - \frac{5 - 0}{1.34}\right)$$

$$= \Phi(-1.15) - \Phi(-6.31) = .1251$$

It is easy to verify that $\beta(-5) = .1251$ also. Thus the power is $1 - \beta(5) = .8749$.
Because this is somewhat less than .9, slightly larger sample sizes should be used.
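
The three $\beta(\Delta')$ expressions translate directly into code. The following Python sketch is illustrative only: the function name `type2_error_prob` is hypothetical, and critical values are obtained from `statistics.NormalDist().inv_cdf` rather than from a printed table, so results may differ in the third decimal place from hand calculations that use rounded values such as 2.58 and 1.34.

```python
from statistics import NormalDist


def type2_error_prob(delta_prime, delta0, sigma1, sigma2, m, n,
                     alpha, alternative="two-sided"):
    """beta(delta') for the two-sample z test with known sigma1 and sigma2."""
    std_normal = NormalDist()
    sigma = (sigma1**2 / m + sigma2**2 / n) ** 0.5    # sd of Xbar - Ybar
    shift = (delta_prime - delta0) / sigma
    if alternative == "greater":
        z_alpha = std_normal.inv_cdf(1 - alpha)
        return std_normal.cdf(z_alpha - shift)
    if alternative == "less":
        z_alpha = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(-z_alpha - shift)
    if alternative == "two-sided":
        z_half = std_normal.inv_cdf(1 - alpha / 2)
        return std_normal.cdf(z_half - shift) - std_normal.cdf(-z_half - shift)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# Setting of Example 9.3: sigma1 = 4.0, sigma2 = 5.0, m = 20, n = 25, alpha = .01.
beta_pos = type2_error_prob(5, 0, 4.0, 5.0, 20, 25, alpha=0.01)
beta_neg = type2_error_prob(-5, 0, 4.0, 5.0, 20, 25, alpha=0.01)
print(f"beta(5) = {beta_pos:.4f}, beta(-5) = {beta_neg:.4f}, power = {1 - beta_pos:.4f}")
```

The two calls illustrate the symmetry noted in the example: for the two-tailed test, $\beta(5)$ and $\beta(-5)$ coincide.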


As in Chapter 8, sample sizes $m$ and $n$ can be determined that will satisfy both
P(type I error) = a specified $\alpha$ and P(type II error when $\mu_1 - \mu_2 = \Delta'$) = a specified
$\beta$. For an upper-tailed test, equating the previous expression for $\beta(\Delta')$ to the
specified value of $\beta$ gives

$$\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n} = \frac{(\Delta' - \Delta_0)^2}{(z_\alpha + z_\beta)^2}$$

When the two sample sizes are equal, this equation yields

$$m = n = \frac{(\sigma_1^2 + \sigma_2^2)(z_\alpha + z_\beta)^2}{(\Delta' - \Delta_0)^2}$$

These expressions are also correct for a lower-tailed test, whereas $\alpha$ is replaced by
$\alpha/2$ for a two-tailed test.
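
As a rough illustration of the equal-sample-size formula, here is a short Python sketch. The function name `equal_sample_size` is hypothetical, and rounding the result up to the next integer is an assumption of this sketch (the formula itself generally gives a non-integer value). The inputs reuse the steel setting of Examples 9.1 and 9.3: $\sigma_1 = 4.0$, $\sigma_2 = 5.0$, $\Delta' = 5$, a two-tailed level .01 test, and desired power .90 (so $\beta = .10$).

```python
import math
from statistics import NormalDist


def equal_sample_size(sigma1, sigma2, delta_prime, delta0,
                      alpha, beta, two_tailed=False):
    """Common sample size m = n giving the specified alpha and beta(delta')."""
    tail_area = alpha / 2 if two_tailed else alpha    # alpha/2 for a two-tailed test
    z_alpha = NormalDist().inv_cdf(1 - tail_area)
    z_beta = NormalDist().inv_cdf(1 - beta)
    n = ((sigma1**2 + sigma2**2) * (z_alpha + z_beta) ** 2
         / (delta_prime - delta0) ** 2)
    return math.ceil(n)                               # round up (assumption of this sketch)


n_required = equal_sample_size(4.0, 5.0, 5.0, 0.0,
                               alpha=0.01, beta=0.10, two_tailed=True)
print(n_required)
```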


Large-Sample Tests 


The assumptions of normal population distributions and known values of $\sigma_1$ and $\sigma_2$
are fortunately unnecessary when both sample sizes are sufficiently large. In this
case, the Central Limit Theorem guarantees that $\bar{X} - \bar{Y}$ has approximately a normal
distribution regardless of the underlying population distributions. Furthermore,
using $S_1^2$ and $S_2^2$ in place of $\sigma_1^2$ and $\sigma_2^2$ in Expression (9.1) gives a variable whose
distribution is approximately standard normal:

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}}$$

A large-sample test statistic results from replacing $\mu_1 - \mu_2$ by $\Delta_0$, the expected
value of $\bar{X} - \bar{Y}$ when $H_0$ is true. This statistic $Z$ then has approximately a standard
normal distribution when $H_0$ is true, which allows for straightforward determination
of a P-value as a z curve area.


Use of the test statistic value

$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

along with the previously stated prescriptions for P-value determination
gives large-sample tests whose significance levels are approximately $\alpha$.
These tests are usually appropriate if both $m > 40$ and $n > 40$.


EXAMPLE 9.4 What impact does fast-food consumption have on various dietary and health charact- 
eristics? The article “Effects of Fast-Food Consumption on Energy Intake and 
Diet Quality Among Children in a National Household Study” (Pediatrics, 2004: 
112-118) reported the accompanying summary data on daily calorie intake both for 
a sample of teens who said they did not typically eat fast food and another sample 
of teens who said they did usually eat fast food. 


Eat Fast Food Sample Size Sample Mean Sample SD 
No 663 2258 1519 
Yes 413 2637 1138 


Does this data provide strong evidence for concluding that true average calorie intake 
for teens who typically eat fast food exceeds by more than 200 calories per day the 
true average intake for those who don’t typically eat fast food? Let’s investigate by 
carrying out a test of hypotheses at a significance level of approximately .05. 

The parameter of interest is $\mu_1 - \mu_2$, where $\mu_1$ is the true average calorie
intake for teens who don’t typically eat fast food and $\mu_2$ is true average intake for
teens who do typically eat fast food. The hypotheses of interest are

$$H_0: \mu_1 - \mu_2 = -200 \quad \text{versus} \quad H_a: \mu_1 - \mu_2 < -200$$

The alternative hypothesis asserts that true average daily intake for those who typically
eat fast food exceeds that for those who don’t by more than 200 calories. The
test statistic value is

$$z = \frac{\bar{x} - \bar{y} - (-200)}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

The calculated test statistic value is

$$z = \frac{2258 - 2637 + 200}{\sqrt{\dfrac{(1519)^2}{663} + \dfrac{(1138)^2}{413}}} = \frac{-179}{81.34} = -2.20$$

The inequality in $H_a$ implies that P-value $= \Phi(-2.20) = .0139$ (a lower-tailed test).
Since $.0139 \leq .05$, the null hypothesis is rejected. At a significance level of .05, it
does appear that true average daily calorie intake for teens who typically eat fast
food exceeds by more than 200 the true average intake for those who don’t typically
eat such food. However, the P-value is not small enough to justify rejecting $H_0$ at
significance level .01.

Notice that if the label 1 had instead been used for the fast-food condition and
2 had been used for the no-fast-food condition, then 200 would have replaced $-200$
in both hypotheses and $H_a$ would have contained the inequality $>$, implying an upper-tailed
test. The resulting test statistic value would have been 2.20, giving the same
P-value as before.
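
The relabeling remark can be verified directly from the summary data of Example 9.4. The Python sketch below is an illustration, with the hypothetical helper `large_sample_z` standing in for the large-sample test statistic; it computes the statistic under both labelings and compares the resulting P-values.

```python
from statistics import NormalDist


def large_sample_z(mean1, sd1, size1, mean2, sd2, size2, delta0):
    """Large-sample z statistic for H0: mu1 - mu2 = delta0 (sample SDs plugged in)."""
    se = (sd1**2 / size1 + sd2**2 / size2) ** 0.5
    return (mean1 - mean2 - delta0) / se


phi = NormalDist().cdf

# Original labeling: 1 = no fast food, 2 = fast food; lower-tailed test.
z_orig = large_sample_z(2258, 1519, 663, 2637, 1138, 413, delta0=-200)
p_orig = phi(z_orig)

# Swapped labeling: 1 = fast food, 2 = no fast food; upper-tailed test.
z_swap = large_sample_z(2637, 1138, 413, 2258, 1519, 663, delta0=200)
p_swap = 1 - phi(z_swap)

print(f"z = {z_orig:.2f}, P-value = {p_orig:.4f}")
print(f"z = {z_swap:.2f}, P-value = {p_swap:.4f}")
```

Swapping the labels flips the sign of both the statistic and the null value, so the tail area, and hence the conclusion, is unchanged.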


Confidence Intervals for $\mu_1 - \mu_2$


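When both population distributions are normal, standardizing $\bar{X} - \bar{Y}$ gives a random
variable $Z$ with a standard normal distribution. Since the area under the z curve
between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is $1 - \alpha$, it follows that

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} < z_{\alpha/2}\right) = 1 - \alpha$$

Manipulation of the inequalities inside the parentheses to isolate $\mu_1 - \mu_2$ yields the
equivalent probability statement

$$P\left(\bar{X} - \bar{Y} - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}} < \mu_1 - \mu_2 < \bar{X} - \bar{Y} + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}\right) = 1 - \alpha$$

This implies that a $100(1 - \alpha)\%$ CI for $\mu_1 - \mu_2$ has lower limit $\bar{x} - \bar{y} - z_{\alpha/2} \cdot \sigma_{\bar{X}-\bar{Y}}$
and upper limit $\bar{x} - \bar{y} + z_{\alpha/2} \cdot \sigma_{\bar{X}-\bar{Y}}$, where $\sigma_{\bar{X}-\bar{Y}}$ is the square-root expression. This
interval is a special case of the general formula $\hat{\theta} \pm z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$.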

If both $m$ and $n$ are large, the CLT implies that this interval is valid even
without the assumption of normal populations; in this case, the confidence level is
approximately $100(1 - \alpha)\%$. Furthermore, use of the sample variances $S_1^2$ and $S_2^2$ in
the standardized variable $Z$ yields a valid interval in which $s_1^2$ and $s_2^2$ replace $\sigma_1^2$ and $\sigma_2^2$.

Provided that $m$ and $n$ are both large, a CI for $\mu_1 - \mu_2$ with a confidence level
of approximately $100(1 - \alpha)\%$ is

$$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

where $-$ gives the lower limit and $+$ the upper limit of the interval. An upper
or a lower confidence bound can also be calculated by retaining the appropriate
sign ($+$ or $-$) and replacing $z_{\alpha/2}$ by $z_\alpha$.

Our standard rule of thumb for characterizing sample sizes as large is $m > 40$ and
$n > 40$.


EXAMPLE 9.5 Enhanced heavy oil recovery uses steam delivered to the production zone. The 
annulus between rock formation and the metal casing pipe is filled with cement. The 
article “Thermal Stability of the Cement Sheath in Steam Treated Oil Wells” (J. 
of the Amer. Ceramic Soc., 2011: 4463-4470) reported on a study of cement sheath 
performance when various thermal cements were cured at 35 °C and then heated to 
230 °C. Here is summary data on Vicker’s hardness (MPa) for both a control cement 
and an experimental cement: 


Type Sample Size Sample Mean Sample SD 
Control 50 24.3 5.2 
Experimental 50 27.0 5.8 


Figure 9.1 shows a comparative boxplot of data consistent with these summary 
quantities. The main difference between the two samples appears to be where they 
are centered. 


[Comparative boxplot of the Control and Exptl samples; horizontal axis: Hardness, scaled from 10 to 40]

Figure 9.1 A comparative boxplot of the hardness data


Let’s now calculate a confidence interval for the difference between true average 
hardness for the control cement (j1,) and true average hardness for the experimental 
cement (j1,) using a confidence level of 95%: 


24.3 — 27.0 = (1.96) 7 - OSs 2.7 + (1.96)(1.1016) 
: Ot (1. 50 50 tes : 


= —2.7 42.2 = (—439, —.5) 


That is, with 95% confidence, $-4.9 < \mu_1 - \mu_2 < -.5$. We can therefore be highly
confident that true average hardness for the experimental cement exceeds that for the
control cement by between .5 and 4.9 MPa. This CI does not include 0, so at the chosen
confidence level, 0 is not a plausible value of $\mu_1 - \mu_2$. According to the relationship
between CI’s and HT’s discussed in Section 8.5, the null hypothesis $H_0: \mu_1 - \mu_2 = 0$
should be rejected in favor of $H_a: \mu_1 - \mu_2 \neq 0$ at significance level .05 (the P-value
for this test given in the cited article is not in agreement with other summary data).
Notice that if we relabel so that $\mu_1$ refers to the experimental cement and $\mu_2$
to the control cement, the CI becomes (.5, 4.9). The interpretation of the interval is
exactly the same as was that of the first interval. ■
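
The arithmetic of Example 9.5 can be checked with a few lines of Python. This is only a sketch: the function name `large_sample_ci` and its argument names are illustrative choices, not anything from the cited article, and the z critical value is taken from the standard library's normal distribution rather than from a printed table.

```python
import math
from statistics import NormalDist

def large_sample_ci(xbar, ybar, s1, s2, m, n, conf=0.95):
    """Large-sample CI for mu1 - mu2 (hypothetical helper for illustration)."""
    # z critical value capturing the central `conf` area under the z curve
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # estimated standard error of xbar - ybar
    se = math.sqrt(s1**2 / m + s2**2 / n)
    diff = xbar - ybar
    return diff - z * se, diff + z * se

# Summary data from Example 9.5 (control first, experimental second)
lo, hi = large_sample_ci(24.3, 27.0, 5.2, 5.8, 50, 50)
print(round(lo, 1), round(hi, 1))
```

Swapping the roles of the two samples simply changes the sign of both limits, which mirrors the relabeling remark at the end of the example.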
If the variances $\sigma_1^2$ and $\sigma_2^2$ are at least approximately known and the investigator
uses equal sample sizes, then the common sample size n that yields a $100(1 - \alpha)\%$
interval of width w is

$$n = \frac{4z_{\alpha/2}^2(\sigma_1^2 + \sigma_2^2)}{w^2}$$

which will generally have to be rounded up to an integer.
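
As a rough illustration of the sample size formula, the sketch below computes the common n for a requested interval width. The helper name `common_sample_size` is hypothetical, and the planning values 5.2 and 5.8 are an assumption: they are simply the sample standard deviations of Example 9.5 reused as stand-ins for the unknown population values.

```python
import math
from statistics import NormalDist

def common_sample_size(sigma1, sigma2, width, conf=0.95):
    """Common n giving a CI of the requested total width (illustrative helper)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    n = 4 * z**2 * (sigma1**2 + sigma2**2) / width**2
    # a fractional n is rounded up so the width requirement is still met
    return math.ceil(n)

# Assumed planning values: sigma1 = 5.2, sigma2 = 5.8, desired width 2 MPa
n_needed = common_sample_size(5.2, 5.8, width=2.0)
print(n_needed)
```

Because n is inversely proportional to the square of w, halving the desired width roughly quadruples the required sample size.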


EXERCISES Section 9.1 (1-16)


1. An article in the November 1983 Consumer Reports compared various types of
batteries. The average lifetimes of Duracell Alkaline AA batteries and Eveready
Energizer Alkaline AA batteries were given as 4.1 hours and 4.5 hours,
respectively. Suppose these are the population average lifetimes.

a. Let $\bar{X}$ be the sample average lifetime of 100 Duracell
batteries and $\bar{Y}$ be the sample average lifetime of 100
Eveready batteries. What is the mean value of $\bar{X} - \bar{Y}$
(i.e., where is the distribution of $\bar{X} - \bar{Y}$ centered)?
How does your answer depend on the specified
sample sizes?

b. Suppose the population standard deviations of lifetime
are 1.8 hours for Duracell batteries and 2.0
hours for Eveready batteries. With the sample sizes
given in part (a), what is the variance of the statistic
$\bar{X} - \bar{Y}$, and what is its standard deviation?

c. For the sample sizes given in part (a), draw a picture
of the approximate distribution curve of $\bar{X} - \bar{Y}$
(include a measurement scale on the horizontal axis).
Would the shape of the curve necessarily be the same
for sample sizes of 10 batteries of each type?
Explain.


2. The National Health Statistics Reports dated Oct. 22,
2008, included the following information on the heights
(in.) for non-Hispanic white females:


Age            Sample Size   Sample Mean   Std. Error Mean
20-39          866           64.9          .09
60 and older   934           63.1          .11


a. Calculate and interpret a confidence interval at confidence
level approximately 95% for the difference
between population mean height for the younger
women and that for the older women.

b. Let $\mu_1$ denote the population mean height for
those aged 20-39 and $\mu_2$ denote the population mean
height for those aged 60 and older. Interpret the
hypotheses $H_0: \mu_1 - \mu_2 = 1$ and $H_a: \mu_1 - \mu_2 > 1$,
and then carry out a test of these hypotheses at significance
level .001.

c. Based on the P-value calculated in (b) would you
reject the null hypothesis at any reasonable significance
level? Explain your reasoning.

d. What hypotheses would be appropriate if $\mu_1$ referred to
the older age group, $\mu_2$ to the younger age group, and
you wanted to see if there was compelling evidence for
concluding that the population mean height for
younger women exceeded that for older women by
more than 1 in.?


3. Pilates is a popular set of exercises for the treatment of
individuals with lower back pain. The method has six
basic principles: centering, concentration, control, precision,
flow, and breathing. The article “Efficacy of the
Addition of Modified Pilates Exercises to a Minimal
Intervention in Patients with Chronic Low Back
Pain: A Randomized Controlled Trial” (Physical
Therapy, 2013: 309-321) reported on an experiment
involving 86 subjects with nonspecific low back pain.
The participants were randomly divided into two groups
of equal size. The first group received just educational
materials, whereas the second group participated in
6 weeks of Pilates exercises. The sample mean level of
pain (on a scale from 0 to 10) for the control group at a
6-week follow-up was 5.2 and the sample mean for the
treatment group was 3.1; both sample standard deviations
were 2.3.

a. Does it appear that true average pain level for the
control condition exceeds that for the treatment condition?
Carry out a test of hypotheses using a significance
level of .01 (the cited article reported statistical
significance at this $\alpha$, and a sample mean difference
of 2.1 also suggests practical significance).

b. Does it appear that true average pain level for the
control condition exceeds that for the treatment condition
by more than 1? Carry out a test of appropriate
hypotheses.


4. Reliance on solid biomass fuel for cooking and heating
exposes many children from developing countries to high
levels of indoor air pollution. The article “Domestic Fuels,
Indoor Air Pollution, and Children’s Health” (Annals
of the N.Y. Academy of Sciences, 2008: 209-217) presented
information on various pulmonary characteristics in
samples of children whose households in India used either
biomass fuel or liquefied petroleum gas (LPG). For the 755
children in biomass households, the sample mean peak
expiratory flow (a person’s maximum speed of expiration)
was 3.30 L/s, and the sample standard deviation was 1.20.
For the 750 children whose households used liquefied
petroleum gas, the sample mean PEF was 4.25 and the
sample standard deviation was 1.75.

a. Calculate a confidence interval at the 95% confidence
level for the population mean PEF for children
in biomass households and then do likewise for children
in LPG households. What is the simultaneous
confidence level for the two intervals?

b. Carry out a test of hypotheses at significance level
.01 to decide whether true average PEF is lower for
children in biomass households than it is for children
in LPG households (the cited article included a
P-value for this test).

c. FEV$_1$, the forced expiratory volume in 1 second, is
another measure of pulmonary function. The cited
article reported that for the biomass households the
sample mean FEV$_1$ was 2.3 L/s and the sample standard
deviation was .5 L/s. If this information is used
to compute a 95% CI for population mean FEV$_1$,
would the simultaneous confidence level for this
interval and the first interval calculated in (a) be the
same as the simultaneous confidence level determined
there? Explain.


5. Persons having Raynaud’s syndrome are apt to suffer a
sudden impairment of blood circulation in fingers and
toes. In an experiment to study the extent of this impairment,
each subject immersed a forefinger in water and
the resulting heat output (cal/cm²/min) was measured.
For m = 10 subjects with the syndrome, the average heat
output was $\bar{x} = .64$, and for n = 10 nonsufferers, the
average output was 2.05. Let $\mu_1$ and $\mu_2$ denote the true
average heat outputs for the two types of subjects.
Assume that the two distributions of heat output are normal
with $\sigma_1 = .2$ and $\sigma_2 = .4$.

a. Consider testing $H_0: \mu_1 - \mu_2 = -1.0$ versus
$H_a: \mu_1 - \mu_2 < -1.0$ at level .01. Describe in words what
$H_a$ says, and then carry out the test.

b. What is the probability of a type II error when the
actual difference between $\mu_1$ and $\mu_2$ is
$\mu_1 - \mu_2 = -1.2$?

c. Assuming that m = n, what sample sizes are required
to ensure that $\beta = .1$ when $\mu_1 - \mu_2 = -1.2$?


6. An experiment to compare the tension bond strength of
polymer latex modified mortar (Portland cement mortar to
which polymer latex emulsions have been added during
mixing) to that of unmodified mortar resulted in
$\bar{x} = 18.12$ kgf/cm² for the modified mortar (m = 40) and
$\bar{y} = 16.87$ kgf/cm² for the unmodified mortar (n = 32).
Let $\mu_1$ and $\mu_2$ be the true average tension bond strengths for
the modified and unmodified mortars, respectively. Assume
that the bond strength distributions are both normal.

a. Assuming that $\sigma_1 = 1.6$ and $\sigma_2 = 1.4$, test
$H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$ at level .01.

b. Compute the probability of a type II error for the test
of part (a) when $\mu_1 - \mu_2 = 1$.

c. Suppose the investigator decided to use a level .05
test and wished $\beta = .10$ when $\mu_1 - \mu_2 = 1$. If
m = 40, what value of n is necessary?

d. How would the analysis and conclusion of part (a)
change if $\sigma_1$ and $\sigma_2$ were unknown but $s_1 = 1.6$ and
$s_2 = 1.4$?


7. Is there any systematic tendency for part-time college
faculty to hold their students to different standards 
than do full-time faculty? The article “Are There 
Instructional Differences Between Full-Time and 
Part-Time Faculty?” (College Teaching, 2009: 23-26) 
reported that for a sample of 125 courses taught by full- 
time faculty, the mean course GPA was 2.7186 and the 
standard deviation was .63342, whereas for a sample of 
88 courses taught by part-timers, the mean and standard 
deviation were 2.8639 and .49241, respectively. Does it 
appear that true average course GPA for part-time faculty 
differs from that for faculty teaching full-time? Test the 
appropriate hypotheses at significance level .01. 


8. Tensile-strength tests were carried out on two different
grades of wire rod (“Fluidized Bed Patenting of Wire
Rods,” Wire J., June 1977: 56-61), resulting in the
accompanying data.


Grade       Sample Size   Sample Mean (kg/mm²)   Sample SD
AISI 1064   m = 129       $\bar{x} = 107.6$      $s_1 = 1.3$
AISI 1078   n = 129       $\bar{y} = 123.6$      $s_2 = 2.0$


a. Does the data provide compelling evidence for concluding
that true average strength for the 1078 grade
exceeds that for the 1064 grade by more than 10 kg/mm²?
Test the appropriate hypotheses using a significance
level of .01.

b. Estimate the difference between true average
strengths for the two grades in a way that provides
information about precision and reliability.


9. The article “Evaluation of a Ventilation Strategy to
Prevent Barotrauma in Patients at High Risk for
Acute Respiratory Distress Syndrome” (New Engl. J.
of Med., 1998: 355-358) reported on an experiment in
which 120 patients with similar clinical features were
randomly divided into a control group and a treatment
group, each consisting of 60 patients. The sample mean
ICU stay (days) and sample standard deviation for the
treatment group were 19.9 and 39.1, respectively, whereas
these values for the control group were 13.7 and 15.8.

a. Calculate a point estimate for the difference
between true average ICU stay for the treatment
and control groups. Does this estimate suggest that
there is a significant difference between true average
stays under the two conditions?

b. Answer the question posed in part (a) by carrying out
a formal test of hypotheses. Is the result different
from what you conjectured in part (a)?

c. Does it appear that ICU stay for patients given the
ventilation treatment is normally distributed? Explain
your reasoning.

d. Estimate true average length of stay for patients
given the ventilation treatment in a way that conveys
information about precision and reliability.


10. An experiment was performed to compare the fracture
toughness of high-purity 18 Ni maraging steel with
commercial-purity steel of the same type (Corrosion
Science, 1971: 723-736). For m = 32 specimens, the
sample average toughness was $\bar{x} = 65.6$ for the high-purity
steel, whereas for n = 38 specimens of commercial
steel $\bar{y} = 59.8$. Because the high-purity steel is more
expensive, its use for a certain application can be justified
only if its fracture toughness exceeds that of commercial-purity
steel by more than 5. Suppose that both toughness
distributions are normal.

a. Assuming that $\sigma_1 = 1.2$ and $\sigma_2 = 1.1$, test the relevant
hypotheses using $\alpha = .001$.

b. Compute $\beta$ for the test conducted in part (a) when
$\mu_1 - \mu_2 = 6$.


11. The level of lead in the blood was determined for a
sample of 152 male hazardous-waste workers ages
20-30 and also for a sample of 86 female workers,
resulting in a mean ± standard error of 5.5 ± 0.3 for
the men and 3.8 ± 0.2 for the women (“Temporal
Changes in Blood Lead Levels of Hazardous Waste
Workers in New Jersey, 1984-1987,” Environ.
Monitoring and Assessment, 1993: 99-107). Calculate
an estimate of the difference between true average blood
lead levels for male and female workers in a way that
provides information about reliability and precision.


12. The accompanying summary data on total cholesterol
level (mmol/l) was obtained from a sample of Asian postmenopausal
women who were vegans and another sample
of such women who were omnivores (“Vegetarianism,
Bone Loss, and Vitamin D: A Longitudinal Study in
Asian Vegans and Non-Vegans,” European J. of
Clinical Nutr., 2012: 75-82).


Diet       Sample Size   Sample Mean   Sample SD
Vegan      88            5.10          1.07
Omnivore   93            5.55          1.10


Calculate and interpret a 99% CI for the difference
between population mean total cholesterol level for
vegans and population mean total cholesterol level for
omnivores (the cited article included a 95% CI). [Note:
The article described a more sophisticated statistical
analysis for investigating bone density loss taking into
account other characteristics (“covariates”) such as age,
body weight, and various nutritional factors; the resulting
CI included 0, suggesting no diet effect.]


13. A mechanical engineer wishes to compare strength properties
of steel beams with similar beams made with a
particular alloy. The same number of beams, n, of each 
type will be tested. Each beam will be set in a horizontal 
position with a support on each end, a force of 2500 lb 
will be applied at the center, and the deflection will be 
measured. From past experience with such beams, the 
engineer is willing to assume that the true standard 
deviation of deflection for both types of beam is .05 in. 
Because the alloy is more expensive, the engineer wishes 
to test at level .01 whether it has smaller average deflec- 
tion than the steel beam. What value of n is appropriate 
if the desired type II error probability is .05 when the 
difference in true average deflection favors the alloy by 
.04 in.? 


14. The level of monoamine oxidase (MAO) activity in
blood platelets (nm/mg protein/h) was determined for
each individual in a sample of 43 chronic schizophrenics,
resulting in $\bar{x} = 2.69$ and $s_1 = 2.30$, as well as for 45
normal subjects, resulting in $\bar{y} = 6.35$ and $s_2 = 4.03$.
Does this data strongly suggest that true average MAO
activity for normal subjects is more than twice the activity
level for schizophrenics? Derive a test procedure and
carry out the test using $\alpha = .01$. [Hint: $H_0$ and $H_a$ here
have a different form from the three standard cases.
Let $\mu_1$ and $\mu_2$ refer to true average MAO activity for
schizophrenics and normal subjects, respectively, and
consider the parameter $\theta = 2\mu_1 - \mu_2$. Write $H_0$ and $H_a$
in terms of $\theta$, estimate $\theta$, and derive $\hat{\sigma}_{\hat{\theta}}$ (“Reduced
Monoamine Oxidase Activity in Blood Platelets
from Schizophrenic Patients,” Nature, July 28, 1972:
225-226).]


15. a. Show for the upper-tailed test with $\sigma_1$ and $\sigma_2$
known that as either m or n increases, $\beta$ decreases
when $\mu_1 - \mu_2 > \Delta_0$.

b. For the case of equal sample sizes (m = n) and fixed
$\alpha$, what happens to the necessary sample size n as $\beta$
is decreased, where $\beta$ is the desired type II error
probability at a fixed alternative?


16. To decide whether two different types of steel have the
same true average fracture toughness values, n specimens 
of each type are tested, yielding the following results: 


Type Sample Average Sample SD 
1 60.1 1.0 
2 59.9 1.0 


Calculate the P-value for the appropriate two-sample 
z test, assuming that the data was based on n = 100. 
Then repeat the calculation for n = 400. Is the small 
P-value for n = 400 indicative of a difference that has 
practical significance? Would you have been satisfied 
with just a report of the P-value? Comment briefly. 
9.2 The Two-Sample t Test and Confidence Interval


Values of the population variances will usually not be known to an investigator. In 
the previous section, we illustrated for large sample sizes the use of a z test and CI in 
which the sample variances were used in place of the population variances. In fact, 
for large samples, the CLT allows us to use these methods even when the two popula- 
tions of interest are not normal. 

In practice, though, it will often happen that at least one sample size is 
small and the population variances have unknown values. Without the CLT at 
our disposal, we proceed by making specific assumptions about the underly- 
ing population distributions. The use of inferential procedures that follow from 
these assumptions is then restricted to situations in which the assumptions are 
at least approximately satisfied. We could, for example, assume that both popula- 
tion distributions are members of the Weibull family or that they are both Poisson 
distributions. It shouldn’t surprise you to learn that normality is often the most 
reasonable assumption. 


ASSUMPTIONS Both population distributions are normal, so that $X_1, X_2, \ldots, X_m$ is a random
sample from a normal distribution and so is $Y_1, \ldots, Y_n$ (with the X’s and Y’s
independent of one another). The plausibility of these assumptions can be judged
by constructing a normal probability plot of the $x_i$’s and another of the $y_i$’s.


The test statistic and confidence interval formula are based on the same standardized
variable developed in Section 9.1, but the relevant distribution is now t rather than z.


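THEOREM When the population distributions are both normal, the standardized variable

$$T = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}} \qquad (9.2)$$

has approximately a t distribution with df $\nu$ estimated from the data by

$$\nu = \frac{\left(\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}\right)^2}{\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}} = \frac{\left[(se_1)^2 + (se_2)^2\right]^2}{\dfrac{(se_1)^4}{m-1} + \dfrac{(se_2)^4}{n-1}}$$

where

$$se_1 = \frac{s_1}{\sqrt{m}}, \qquad se_2 = \frac{s_2}{\sqrt{n}}$$

(round $\nu$ down to the nearest integer).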


Manipulating T in a probability statement to isolate $\mu_1 - \mu_2$ gives a CI,
whereas a test statistic results from replacing $\mu_1 - \mu_2$ by the null value $\Delta_0$.

The two-sample t confidence interval for $\mu_1 - \mu_2$ with confidence level
$100(1 - \alpha)\%$ is then

$$\bar{x} - \bar{y} \pm t_{\alpha/2,\nu}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

A one-sided confidence bound can be calculated as described earlier.
The two-sample t test for testing $H_0: \mu_1 - \mu_2 = \Delta_0$ is as follows:

Test statistic value:

$$t = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

Alternative Hypothesis               P-Value Determination
$H_a: \mu_1 - \mu_2 > \Delta_0$      Area under the $t_\nu$ curve to the right of t
$H_a: \mu_1 - \mu_2 < \Delta_0$      Area under the $t_\nu$ curve to the left of t
$H_a: \mu_1 - \mu_2 \neq \Delta_0$   2 · (Area under the $t_\nu$ curve to the right of |t|)


Assumptions: Both population distributions are normal, and the two random
samples are selected independently of one another.


EXAMPLE 9.6 The void volume within a textile fabric affects comfort, flammability, and insulation 
properties. Permeability of a fabric refers to the accessibility of void space to the 
flow of a gas or liquid. The article “The Relationship Between Porosity and Air 
Permeability of Woven Textile Fabrics” (J. of Testing and Eval., 1997: 108-114) 
gave summary information on air permeability (cm³/cm²/sec) for a number of different
fabric types. Consider the following data on two different types of plain-weave fabric: 


Fabric Type   Sample Size   Sample Mean   Sample Standard Deviation
Cotton        10            51.71         .79
Triacetate    10            136.14        3.59


Assuming that the porosity distributions for both types of fabric are normal, let’s
calculate a confidence interval for the difference between true average porosity for the
cotton fabric and that for the acetate fabric, using a 95% confidence level. Before the
appropriate t critical value can be selected, df must be determined:

$$\nu = \frac{\left(\dfrac{.6241}{10} + \dfrac{12.8881}{10}\right)^2}{\dfrac{(.6241/10)^2}{9} + \dfrac{(12.8881/10)^2}{9}} = \frac{1.8258}{.1850} = 9.87$$

Thus we use $\nu = 9$; Appendix Table A.5 gives $t_{.025,9} = 2.262$. The resulting interval is

$$51.71 - 136.14 \pm (2.262)\sqrt{\frac{.6241}{10} + \frac{12.8881}{10}} = -84.43 \pm 2.63 = (-87.06, -81.80)$$
With a high degree of confidence, we can say that true average porosity for triacetate
fabric specimens exceeds that for cotton specimens by between 81.80 and 87.06
cm³/cm²/sec. ■
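
The df and interval calculations of Example 9.6 can be reproduced with a short sketch like the one below. The function names `welch_df` and `two_sample_t_ci` are illustrative only. The critical value 2.262 is not computed here; it is passed in as read from Appendix Table A.5, since the Python standard library has no t distribution.

```python
import math

def welch_df(s1, m, s2, n):
    """Estimated df for the two-sample t procedures: (raw value, rounded down)."""
    v1, v2 = s1**2 / m, s2**2 / n          # squared standard errors
    nu = (v1 + v2) ** 2 / (v1**2 / (m - 1) + v2**2 / (n - 1))
    return nu, math.floor(nu)

def two_sample_t_ci(xbar, ybar, s1, m, s2, n, t_crit):
    """CI for mu1 - mu2; t_crit is looked up for the chosen level and df."""
    half_width = t_crit * math.sqrt(s1**2 / m + s2**2 / n)
    diff = xbar - ybar
    return diff - half_width, diff + half_width

# Summary data from Example 9.6 (cotton first, triacetate second)
nu, df = welch_df(0.79, 10, 3.59, 10)
lo, hi = two_sample_t_ci(51.71, 136.14, 0.79, 10, 3.59, 10, t_crit=2.262)
print(round(nu, 2), df)
print(round(lo, 2), round(hi, 2))
```

Note how far the estimated df (about 9.87) falls below m + n − 2 = 18: when one sample variance dominates, the df is close to that sample's own n − 1.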


EXAMPLE 9.7 The deterioration of many municipal pipeline networks across the country is a grow- 
ing concern. One technology proposed for pipeline rehabilitation uses a flexible liner 
threaded through existing pipe. The article “Effect of Welding on a High-Density
Polyethylene Liner” (J. of Materials in Civil Engr., 1996: 94-100) reported the 
following data on tensile strength (psi) of liner specimens both when a certain fusion 
process was used and when this process was not used. 


No fusion   2748  2700  2655  2822  2511
            3149  3257  3213  3220  2753
            m = 10   $\bar{x} = 2902.8$   $s_1 = 277.3$

Fused       3027  3356  3359  3297  3125  2910  2889  2902
            n = 8    $\bar{y} = 3108.1$   $s_2 = 205.9$


Figure 9.2 shows normal probability plots from Minitab. The linear pattern in each 
plot supports the assumption that the tensile strength distributions under the two con- 
ditions are both normal. 


[Two normal probability plots, one per panel: Not fused (horizontal axis 2480 to 3280) and Fused (horizontal axis 2880 to 3380); vertical axis: Probability]


Figure 9.2 Normal probability plots from Minitab for the tensile strength data 


The authors of the article stated that the fusion process increased the average 
tensile strength. The message from the comparative boxplot of Figure 9.3 is not all 
that clear. Let’s carry out a test of hypotheses to see whether the data supports this 
conclusion. 


[Comparative boxplot of Type 1 and Type 2; horizontal axis: Strength, 2500 to 3400]


Figure 9.3 A comparative boxplot of the tensile-strength data 
1. Let $\mu_1$ be the true average tensile strength of specimens when the no-fusion
treatment is used and $\mu_2$ denote the true average tensile strength when the
fusion treatment is used.


2. $H_0: \mu_1 - \mu_2 = 0$ (no difference in the true average tensile strengths for the two
treatments)


3. $H_a: \mu_1 - \mu_2 < 0$ (true average tensile strength for the no-fusion treatment is
less than that for the fusion treatment, so that the investigators’
conclusion is correct)


4. The null value is $\Delta_0 = 0$, so the test statistic value is

$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$


5. We now compute both the test statistic value and df for the test:

$$t = \frac{2902.8 - 3108.1}{\sqrt{\dfrac{(277.3)^2}{10} + \dfrac{(205.9)^2}{8}}} = \frac{-205.3}{113.97} = -1.8$$

Using $s_1^2/m = 7689.529$ and $s_2^2/n = 5299.351$,

$$\nu = \frac{(7689.529 + 5299.351)^2}{(7689.529)^2/9 + (5299.351)^2/7} = \frac{168{,}711{,}003.7}{10{,}581{,}747.35} = 15.94$$

so the test will be based on 15 df.


6. Appendix Table A.8 shows that the area under the 15 df t curve to the right of
1.8 is .046, so the P-value for a lower-tailed test is also .046. The following
Minitab output summarizes all the computations:


Two-sample T for nofusion vs fused 


N Mean StDev SE Mean 
not fused 10 2903 277 88 
fused 8 3108 206 73 


95% C.I. for mu nofusion-mu fused: (-448, 38)
t-Test mu not fused=mu fused (vs<): T=-1.80 P=0.046 DF=15


7. Using a significance level of .05, we can barely reject the null hypothesis in
favor of the alternative hypothesis, confirming the conclusion stated in the
article. However, someone demanding more compelling evidence might select
$\alpha = .01$, a level for which $H_0$ cannot be rejected.


If the question posed had been whether fusing increased true average strength by more
than 100 psi, then the relevant hypotheses would have been $H_0: \mu_1 - \mu_2 = -100$ versus
$H_a: \mu_1 - \mu_2 < -100$; that is, the null value would have been $\Delta_0 = -100$. ■
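
The test in Example 9.7 can also be carried out directly from the raw observations. The sketch below is one way to do so using only the Python standard library. Because the library has no t distribution, the lower-tail area is approximated by integrating the t density numerically with Simpson's rule; that numerical shortcut is an assumption of this sketch, not the method used by Minitab or by Appendix Table A.8. The helper names `t_cdf` and `welch_t` are hypothetical.

```python
import math
from statistics import mean, variance

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) for a t distribution with df degrees of freedom."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def welch_t(x, y, delta0=0.0):
    """Two-sample t statistic and df (rounded down) from raw data."""
    m, n = len(x), len(y)
    v1, v2 = variance(x) / m, variance(y) / n
    t = (mean(x) - mean(y) - delta0) / math.sqrt(v1 + v2)
    df = math.floor((v1 + v2) ** 2 / (v1**2 / (m - 1) + v2**2 / (n - 1)))
    return t, df

no_fusion = [2748, 2700, 2655, 2822, 2511, 3149, 3257, 3213, 3220, 2753]
fused = [3027, 3356, 3359, 3297, 3125, 2910, 2889, 2902]

t_stat, df = welch_t(no_fusion, fused)
p_value = t_cdf(t_stat, df)            # lower-tailed test: area to the left of t
print(round(t_stat, 2), df, round(p_value, 3))
```

The output should agree with the values in the example up to rounding (t near −1.80 on 15 df). Passing `delta0=-100` to `welch_t` gives the statistic for the alternative question about a gain of more than 100 psi.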


Pooled t Procedures 


Alternatives to the two-sample t procedures just described result from assuming not
only that the two population distributions are normal but also that they have equal
variances ($\sigma_1^2 = \sigma_2^2$). That is, the two population distribution curves are assumed
normal with equal spreads, the only possible difference between them being where
they are centered.
Let $\sigma^2$ denote the common population variance. Then standardizing $\bar{X} - \bar{Y}$
gives

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma^2}{m} + \dfrac{\sigma^2}{n}}}$$


which has a standard normal distribution. Before this variable can be used as a basis
for making inferences about $\mu_1 - \mu_2$, the common variance must be estimated from
sample data. One estimator of $\sigma^2$ is $S_1^2$, the variance of the m observations in the first
sample, and another is $S_2^2$, the variance of the second sample. Intuitively, a better
estimator than either individual sample variance results from combining the two sample
variances. A first thought might be to use $(S_1^2 + S_2^2)/2$. However, if m > n, then the
first sample contains more information about $\sigma^2$ than does the second sample, and
an analogous comment applies if m < n. The following weighted average of the two
sample variances, called the pooled (i.e., combined) estimator of $\sigma^2$, adjusts for any
difference between the two sample sizes:

$$S_p^2 = \frac{m-1}{m+n-2}\cdot S_1^2 + \frac{n-1}{m+n-2}\cdot S_2^2$$


The first sample contributes m − 1 degrees of freedom to the estimate of $\sigma^2$, and the
second sample contributes n − 1 df, for a total of m + n − 2 df. Statistical theory
says that if $S_p^2$ replaces $\sigma^2$ in the expression for Z, the resulting standardized variable
has a t distribution based on m + n − 2 df. In the same way that earlier standardized
variables were used as a basis for deriving confidence intervals and test procedures,
this t variable immediately leads to the pooled t CI for estimating $\mu_1 - \mu_2$
and the pooled t test for testing hypotheses about a difference between means.
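
To make the pooled estimator concrete, the following sketch computes $s_p^2$ and the pooled t statistic from summary statistics. The function names are illustrative. The summary values of Example 9.7 are reused purely as a numerical illustration; this is an assumption of the sketch and not an analysis carried out in the text, which used the unpooled procedure for those data.

```python
import math

def pooled_variance(s1, m, s2, n):
    """Weighted average of the two sample variances, weights (m-1) and (n-1)."""
    return ((m - 1) * s1**2 + (n - 1) * s2**2) / (m + n - 2)

def pooled_t(xbar, ybar, s1, m, s2, n, delta0=0.0):
    """Pooled t statistic and its df, assuming equal population variances."""
    sp2 = pooled_variance(s1, m, s2, n)
    se = math.sqrt(sp2 * (1 / m + 1 / n))
    return (xbar - ybar - delta0) / se, m + n - 2

# Summary statistics from Example 9.7, used here only for illustration
sp2 = pooled_variance(277.3, 10, 205.9, 8)
t_pooled, df_pooled = pooled_t(2902.8, 3108.1, 277.3, 10, 205.9, 8)
print(round(sp2, 2), round(t_pooled, 3), df_pooled)
```

The pooled variance always lies between the two sample variances and sits closer to the one from the larger sample. Here the pooled analysis uses 16 df, whereas the unpooled analysis of the same summary values used 15.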

In the past, many statisticians recommended these pooled t procedures over
the two-sample t procedures. The pooled t test, for example, can be derived from
the likelihood ratio principle, whereas the two-sample t test is not a likelihood ratio
test. Furthermore, the significance level for the pooled t test is exact, whereas it is
only approximate for the two-sample t test. However, recent research has shown that
although the pooled t test does outperform the two-sample t test by a bit (smaller $\beta$’s for
the same $\alpha$) when $\sigma_1^2 = \sigma_2^2$, the former test can easily lead to erroneous conclusions if
applied when the variances are different. Analogous comments apply to the behavior of
the two confidence intervals. That is, the pooled t procedures are not robust to violations
of the equal variance assumption.

It has been suggested that one could carry out a preliminary test of $H_0: \sigma_1^2 = \sigma_2^2$
and use a pooled t procedure if this null hypothesis is not rejected. Unfortunately,
the usual “F test” of equal variances (Section 9.5) is quite sensitive to the assumption
of normal population distributions, much more so than t procedures. We therefore
recommend the conservative approach of using two-sample t procedures unless
there is really compelling evidence for doing otherwise, particularly when the two
sample sizes are different.


Type II Error Probabilities 


Determining type II error probabilities (or equivalently, power = 1 − $\beta$) for the
two-sample t test is complicated. There does not appear to be any simple way to
use the $\beta$ curves of Appendix Table A.17. The most recent version of Minitab
(Version 16) will calculate power for the pooled t test but not for the two-sample t
test. However, the UCLA Statistics Department homepage (http://www.stat.ucla.edu)
permits access to a power calculator that will do this. For example, we specified
m = 10, n = 8, $\sigma_1 = 300$, $\sigma_2 = 225$ (these are the sample sizes for Example 9.7,
whose sample standard deviations are somewhat smaller than these values of $\sigma_1$ and
$\sigma_2$) and asked for the power of a two-tailed level .05 test of $H_0: \mu_1 - \mu_2 = 0$ when
$\mu_1 - \mu_2 = 100$, 250, and 500. The resulting values of the power were .1089, .4609, and
.9635 (corresponding to $\beta = .89$, .54, and .04), respectively. In general, $\beta$ will decrease
as the sample sizes increase, as $\alpha$ increases, and as $\mu_1 - \mu_2$ moves farther from 0. The
software will also calculate sample sizes necessary to obtain a specified value of power
for a particular value of $\mu_1 - \mu_2$.
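
When no power calculator is at hand, power for the two-sample t test can be approximated by simulation. The sketch below is a Monte Carlo stand-in and is not the method used by the UCLA calculator: it repeatedly draws normal samples under a chosen alternative, applies the two-tailed level .05 two-sample t test (with df rounded down), and reports the proportion of rejections. All function names are hypothetical, the P-value is obtained by numerical integration of the t density, and the estimates will differ slightly from the values quoted above because of simulation error.

```python
import math
import random

def t_upper_tail(t, df, steps=200):
    """Approximate area to the right of |t| under the t curve (Simpson's rule)."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    a = abs(t)
    h = a / steps
    total = c + c * (1 + a * a / df) ** (-(df + 1) / 2)
    for i in range(1, steps):
        u = i * h
        total += (4 if i % 2 else 2) * c * (1 + u * u / df) ** (-(df + 1) / 2)
    return 0.5 - total * h / 3

def mean_var(data):
    """Sample mean and sample variance (divisor k - 1)."""
    k = len(data)
    mu = sum(data) / k
    return mu, sum((d - mu) ** 2 for d in data) / (k - 1)

def rejects(x, y, alpha=0.05):
    """True if the two-tailed two-sample t test of H0: mu1 - mu2 = 0 rejects."""
    (xbar, vx), (ybar, vy) = mean_var(x), mean_var(y)
    v1, v2 = vx / len(x), vy / len(y)
    t = (xbar - ybar) / math.sqrt(v1 + v2)
    df = math.floor((v1 + v2) ** 2 / (v1**2 / (len(x) - 1) + v2**2 / (len(y) - 1)))
    return 2 * t_upper_tail(t, df) < alpha

def simulated_power(diff, m=10, n=8, sigma1=300.0, sigma2=225.0, reps=2000, seed=0):
    """Proportion of simulated data sets in which H0 is rejected."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(diff, sigma1) for _ in range(m)]
        y = [rng.gauss(0.0, sigma2) for _ in range(n)]
        hits += rejects(x, y)
    return hits / reps

powers = {d: simulated_power(d) for d in (100, 250, 500)}
print(powers)
```

Increasing `reps` tightens the estimates at the cost of run time. Rerunning with larger m and n shows the decrease in $\beta$ described above.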


EXERCISES Section 9.2 (17-35) 


17. Determine the number of degrees of freedom for the two-sample
t test or CI in each of the following situations:

a. m = 10, n = 10, $s_1 = 5.0$, $s_2 = 6.0$
b. m = 10, n = 15, $s_1 = 5.0$, $s_2 = 6.0$
c. m = 10, n = 15, $s_1 = 2.0$, $s_2 = 6.0$
d. m = 12, n = 24, $s_1 = 5.0$, $s_2 = 6.0$


18. Which way of dispensing champagne, the traditional
vertical method or a tilted beer-like pour, preserves more
of the tiny gas bubbles that improve flavor and aroma?
The following data was reported in the article “On the
Losses of Dissolved CO₂ during Champagne Serving”
(J. Agr. Food Chem., 2010: 8768-8775).


Temp (°C)   Type of Pour   n   Mean (g/L)    SD
18          Traditional    4   4.0           [illegible]
18          Slanted        4   [illegible]   [illegible]
12          Traditional    4   [illegible]   [illegible]
12          Slanted        4   [illegible]   [illegible]


Assume that the sampled distributions are normal.

a. Carry out a test at significance level .01 to decide
whether true average CO₂ loss at 18 °C for the traditional
pour differs from that for the slanted pour.

b. Repeat the test of hypotheses suggested in (a) for the
12° temperature. Is the conclusion different from that
for the 18° temperature? Note: The 12° result was
reported in the popular media.


19. Suppose $\mu_1$ and $\mu_2$ are true mean stopping distances at
50 mph for cars of a certain type equipped with two
different types of braking systems. Use the two-sample
t test at significance level .01 to test $H_0: \mu_1 - \mu_2 = -10$
versus $H_a: \mu_1 - \mu_2 < -10$ for the following data:
m = 6, $\bar{x} = 115.7$, $s_1 = 5.03$, n = 6, $\bar{y} = 129.3$, and
$s_2 = 5.38$.


20. Use the data of Exercise 19 to calculate a 95% CI for the
difference between true average stopping distance for 
cars equipped with system 1 and cars equipped with 
system 2. Does the interval suggest that precise informa- 
tion about the value of this difference is available? 


21. Quantitative noninvasive techniques are needed for routinely
assessing symptoms of peripheral neuropathies,
such as carpal tunnel syndrome (CTS). The article “A 
Gap Detection Tactility Test for Sensory Deficits 
Associated with Carpal Tunnel Syndrome” 
(Ergonomics, 1995: 2588-2601) reported on a test that 
involved sensing a tiny gap in an otherwise smooth sur- 
face by probing with a finger; this functionally resembles 
many work-related tactile activities, such as detecting 
scratches or surface defects. When finger probing was not 
allowed, the sample average gap detection threshold for 
m = 8 normal subjects was 1.71 mm, and the sample 
standard deviation was .53; for n = 10 CTS subjects, the 
sample mean and sample standard deviation were 2.53 
and .87, respectively. Does this data suggest that the true 
average gap detection threshold for CTS subjects exceeds 
that for normal subjects? State and test the relevant 
hypotheses using a significance level of .01. 


22. According to the article “Modeling and Predicting the
Effects of Submerged Arc Weldment Process 
Parameters on Weldment Characteristics and Shape 
Profiles” (J. of Engr. Manuf., 2012: 1230-1240), the 
submerged arc welding (SAW) process is commonly 
used for joining thick plates and pipes. The heat affected 
zone (HAZ), a band created within the base metal during 
welding, was of particular interest to the investigators. 
Here are observations on depth (mm) of the HAZ both 
when the current setting was high and when it was lower. 


Non-high 1.04 1.15 1.23 1.69 1.92 
1.98 2.36 2.49 2.72 
1.37 1.43 1.57 1d 1.94 
2.06 2.5) 2.64 2.82 

High 1.55 2.02 2.02 2.05 239 
2.57 2.93 2.94 2.97 


a. Construct a comparative boxplot and comment on interesting features.

b. Is it reasonable to use the two-sample t test to test hypotheses about
the difference between true average HAZ depths for the two conditions?

c. Does it appear that true average HAZ depth is larger for the higher
current condition than for the lower condition? Carry out a test of
appropriate hypotheses using a significance level of .01.


23. Fusible interlinings are being used with increasing frequency to
support outer fabrics and improve the shape and drape of various pieces of
clothing. The article “Compatibility of Outer and Fusible Interlining
Fabrics in Tailored Garments” (Textile Res. J., 1997: 137-142) gave the
accompanying data on extensibility (%) at 100 gm/cm for both high-quality
(H) fabric and poor-quality (P) fabric specimens.


H   1.2   .9   .7  1.0  1.7  1.7  1.1   .9  1.7
    1.9  1.3  2.1  1.6  1.8  1.4  1.3  1.9  1.6
     .8  2.0  1.7  1.6  2.3  2.0


P   1.6  1.5  1.1  2.1  1.5  1.3  1.0  2.6


a. Construct normal probability plots to verify the plausibility of both
samples having been selected from normal population distributions.

b. Construct a comparative boxplot. Does it suggest that there is a
difference between true average extensibility for high-quality fabric
specimens and that for poor-quality specimens?

c. The sample mean and standard deviation for the high-quality sample are
1.508 and .444, respectively, and those for the poor-quality sample are
1.588 and .530. Use the two-sample t test to decide whether true average
extensibility differs for the two types of fabric.


24. Damage to grapes from bird predation is a serious problem for grape
growers. The article “Experimental Method to Investigate and Monitor
Bird Behavior and Damage to Vineyards” (Amer. J. of Enology and
Viticulture, 2004: 288-291) reported on an experiment involving a
bird-feeder table, time-lapse video, and artificial foods. Information was
collected for two different bird species at both the experimental location
and at a natural vineyard setting. Consider the following data on time
(sec) spent on a single visit to the location.


Species       Location    n     $\bar{x}$    SE mean
Blackbirds    Exptl       65    13.4         2.05
Blackbirds    Natural     50     9.7         1.76
Silvereyes    Exptl       34    49.4         4.78
Silvereyes    Natural     46    38.4         5.06


a. Calculate an upper confidence bound for the true average time that
blackbirds spend on a single visit at the experimental location.

b. Does it appear that true average time spent by blackbirds at the
experimental location exceeds the true average time birds of this type
spend at the natural location? Carry out a test of appropriate hypotheses.

c. Estimate the difference between the true average time blackbirds spend
at the natural location and true average time that silvereyes spend at the
natural location, and do so in a way that conveys information about
reliability and precision.


[Note: The sample medians reported in the article all seemed significantly
smaller than the means, suggesting substantial population distribution
skewness. The authors actually used the distribution-free test procedure
presented in Section 2 of Chapter 15.]


25. The accompanying data consists of prices ($) for one sample of
California cabernet sauvignon wines that received ratings of 93 or higher
in the May 2013 issue of Wine Spectator and another sample of California
cabernets that received ratings of 89 or lower in the same issue.


≥ 93:  100  100   60  135  195  195
       125  135   95   42   75   72


≤ 89:   80   75    i   85   75   35   85
        65   45  100   28   38   50   28


Assume that these are both random samples of prices from the population of
all wines recently reviewed that received ratings of at least 93 and at
most 89, respectively.

a. Investigate the plausibility of assuming that both sampled populations
are normal.

b. Construct a comparative boxplot. What does it suggest about the
difference in true average prices?

c. Calculate a confidence interval at the 95% confidence level to estimate
the difference between $\mu_1$, the mean price in the higher rating
population, and $\mu_2$, the mean price in the lower rating population. Is
the interval consistent with the statement “Price rarely equates to
quality” made by a columnist in the cited issue of the magazine?


26. The article “The Influence of Corrosion Inhibitor and Surface
Abrasion on the Failure of Aluminum-Wired Twist-On Connections” (IEEE
Trans. on Components, Hybrids, and Manuf. Tech., 1984: 20-25) reported data
on potential drop measurements for one sample of connectors wired with
alloy aluminum and another sample wired with EC aluminum. Does the
accompanying SAS output suggest that the true average potential drop for
alloy connections (type 1) is higher than that for EC connections (as
stated in the article)? Carry out the appropriate test using a significance
level of .01. In reaching your conclusion, what type of error might you
have committed? [Note: SAS reports the P-value for a two-tailed test.]


Type    N     Mean           Std Dev       Std Error
1       20    17.49900000    0.55012821    0.12301241
2       20    16.90000000    0.48998389    0.10956373


Variances    T         DF      Prob>|T|
Unequal      3.6362    37.5    0.0008
Equal        3.6362    38.0    0.0008


27. Anorexia Nervosa (AN) is a psychiatric condition leading to substantial
weight loss among women who are fearful of becoming fat. The article
“Adipose Tissue Distribution After Weight Restoration and Weight
Maintenance in Women with Anorexia Nervosa” (Amer. J. of Clinical Nutr.,
2009: 1132-1137) used whole-body magnetic resonance imagery to determine
various tissue characteristics for both an AN sample of individuals who had
undergone acute weight restoration and maintained their weight for a year
and a comparable (at the outset of the study) control sample. Here is
summary data on intermuscular adipose tissue (IAT; kg).


Condition Sample Size Sample Mean Sample SD 
AN 16 a2 26 
Control 8 35 ~lS 


Assume that both samples were selected from normal distributions.

a. Calculate an estimate for true average IAT under the described AN
protocol, and do so in a way that conveys information about the reliability
and precision of the estimation.

b. Calculate an estimate for the difference between true average AN IAT and
true average control IAT, and do so in a way that conveys information about
the reliability and precision of the estimation. What does your estimate
suggest about true average AN IAT relative to true average control IAT?


28. As the population ages, there is increasing concern about
accident-related injuries to the elderly. The article “Age and Gender
Differences in Single-Step Recovery from a Forward Fall” (J. of
Gerontology, 1999: M44-M50) reported on an experiment in which the maximum
lean angle (the farthest a subject is able to lean and still recover in one
step) was determined for both a sample of younger females (21-29 years) and
a sample of older females (67-81 years). The following observations are
consistent with summary data given in the article:


YF: 29, 34, 33, 27, 28, 32, 31, 34, 32, 27
OF: 18, 15, 23, 13, 12


Does the data suggest that true average maximum lean angle for older
females is more than 10 degrees smaller than it is for younger females?
State and test the relevant hypotheses at significance level .10.


29. The article “Effect of Internal Gas Pressure on the Compression
Strength of Beverage Cans and Plastic Bottles” (J. of Testing and
Evaluation, 1993: 129-131) includes the accompanying data on compression
strength (lb) for a sample of 12-oz aluminum cans filled with strawberry
drink and another sample filled with cola. Does the data suggest that the
extra carbonation of cola results in a higher average compression strength?
Base your answer on a P-value. What assumptions are necessary for your
analysis?


Beverage            Sample Size    Sample Mean    Sample SD
Strawberry drink    15             540            21
Cola                15             554            15


30. The article “Flexure of Concrete Beams Reinforced with Advanced
Composite Orthogrids” (J. of Aerospace Engr., 1997: 7-15) gave the
accompanying data on ultimate load (kN) for two different types of beams.


Type                      Sample Size    Sample Mean    Sample SD
Fiberglass grid           26             33.4           2.2
Commercial carbon grid    26             42.8           4.3


a. Assuming that the underlying distributions are normal, calculate and
interpret a 99% CI for the difference between true average load for the
fiberglass beams and that for the carbon beams.

b. Does the upper limit of the interval you calculated in part (a) give a
99% upper confidence bound for the difference between the two $\mu$’s? If
not, calculate such a bound. Does it strongly suggest that true average
load for the carbon beams is more than that for the fiberglass beams?
Explain.


31. Refer to Exercise 33 in Section 7.3. The cited article also gave the
following observations on degree of polymerization for specimens having
viscosity times concentration in a higher range:


429  430  430  431  436  437
440  441  445  446  447

a. Construct a comparative boxplot for the two samples, and comment on any
interesting features.

b. Calculate a 95% confidence interval for the difference between true
average degree of polymerization for the middle range and that for the high
range. Does the interval suggest that $\mu_1$ and $\mu_2$ may in fact be
different? Explain your reasoning.


32. The degenerative disease osteoarthritis most frequently affects
weight-bearing joints such as the knee. The article “Evidence of
Mechanical Load Redistribution at the Knee Joint in the Elderly When
Ascending Stairs and Ramps” (Annals of Biomed. Engr., 2008: 467-476)
presented the following summary data on stance duration (ms) for samples of
both older and younger adults.


Age        Sample Size    Sample Mean    Sample SD
Older      28             801            117
Younger    16             780            72
Assume that both stance duration distributions are normal.

a. Calculate and interpret a 99% CI for true average stance duration among
elderly individuals.

b. Carry out a test of hypotheses at significance level .05 to decide
whether true average stance duration is larger among elderly individuals
than among younger individuals.


33. The article “The Effects of a Low-Fat, Plant-Based Dietary
Intervention on Body Weight, Metabolism, and Insulin Sensitivity in
Postmenopausal Women” (Amer. J. of Med., 2005: 991-997) reported on the
results of an experiment in which half of the individuals in a group of 64
postmenopausal overweight women were randomly assigned to a particular
vegan diet, and the other half received a diet based on National
Cholesterol Education Program guidelines. The sample mean decrease in body
weight for those on the vegan diet was 5.8 kg, and the sample SD was 3.2,
whereas for those on the control diet, the sample mean weight loss and
standard deviation were 3.8 and 2.8, respectively. Does it appear the true
average weight loss for the vegan diet exceeds that for the control diet by
more than 1 kg? Carry out an appropriate test of hypotheses at significance
level .05.


34. Consider the pooled t variable

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p \sqrt{\dfrac{1}{m} + \dfrac{1}{n}}}$$

which has a t distribution with $m + n - 2$ df when both population
distributions are normal with $\sigma_1 = \sigma_2$ (see the Pooled t
Procedures subsection for a description of $S_p$).

a. Use this t variable to obtain a pooled t confidence interval formula for
$\mu_1 - \mu_2$.

b. A sample of ultrasonic humidifiers of one particular brand was selected
for which the observations on maximum output of moisture (oz) in a
controlled chamber were 14.0, 14.3, 12.2, and 15.1. A sample of the second
brand gave output values 12.1, 13.6, 11.9, and 11.2 (“Multiple
Comparisons of Means Using Simultaneous Confidence Intervals,” J. of
Quality Technology, 1989: 232-241). Use the pooled t formula from part (a)
to estimate the difference between true average outputs for the two brands
with a 95% confidence interval.

c. Estimate the difference between the two $\mu$’s using the two-sample t
interval discussed in this section, and compare it to the interval of
part (b).


35. Refer to Exercise 34. Describe the pooled t test for testing
$H_0$: $\mu_1 - \mu_2 = \Delta_0$ when both population distributions are
normal with $\sigma_1 = \sigma_2$. Then use this test procedure to test the
hypotheses suggested in Exercise 33.


9.3 Analysis of Paired Data


In Sections 9.1 and 9.2, we considered making an inference about a difference between
two means $\mu_1$ and $\mu_2$. This was done by utilizing the results of a random sample
$X_1, X_2, \ldots, X_m$ from the distribution with mean $\mu_1$ and a completely independent
(of the X’s) sample $Y_1, \ldots, Y_n$ from the distribution with mean $\mu_2$. That is, either
m individuals were selected from population 1 and n different individuals from
population 2, or m individuals (or experimental objects) were given one treatment and
another set of n individuals were given the other treatment. In contrast, there are a
number of experimental situations in which there is only one set of n individuals or
experimental objects; making two observations on each one results in a natural pairing
of values.


EXAMPLE 9.8 Trace metals in drinking water affect the flavor, and unusually high concentrations
can pose a health hazard. The article “Trace Metals of South Indian River” (Envir.
Studies, 1982: 62-66) reported on a study in which six river locations were selected
(six experimental objects) and the zinc concentration (mg/L) determined for both
surface water and bottom water at each location. The six pairs of observations are
displayed in the accompanying table. Does the data suggest that true average
concentration in bottom water exceeds that of surface water?


                                      Location
                          1      2      3      4      5      6
Zinc concentration in
bottom water (x)        .430   .266   .567   .531   .707   .716
Zinc concentration in
surface water (y)       .415   .238   .390   .410   .605   .609
Difference              .015   .028   .177   .121   .102   .107

Figure 9.4(a) displays a plot of this data. At first glance, there appears to be
little difference between the x and y samples. From location to location, there
is a great deal of variability in each sample, and it looks as though any
differences between the samples can be attributed to this variability. However, when
the observations are identified by location, as in Figure 9.4(b), a different view
emerges. At each location, bottom concentration exceeds surface concentration.
This is confirmed by the fact that all x − y differences displayed in the bottom
row of the data table are positive. A correct analysis of this data focuses on these
differences.
Figure 9.4 Plot of paired data from Example 9.8: (a) observations not identified by location; (b)
observations identified by location


ASSUMPTIONS The data consists of n independently selected pairs $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$,
with $E(X_i) = \mu_1$ and $E(Y_i) = \mu_2$. Let $D_1 = X_1 - Y_1$, $D_2 = X_2 - Y_2$, ...,
$D_n = X_n - Y_n$, so the $D_i$’s are the differences within pairs. The $D_i$’s are assumed
to be normally distributed with mean value $\mu_D$ and variance $\sigma_D^2$ (this is usually a
consequence of the $X_i$’s and $Y_i$’s themselves being normally distributed).


We are again interested in making an inference about the difference $\mu_1 - \mu_2$.
The two-sample t confidence interval and test statistic were obtained by assuming
independent samples and applying the rule $V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y})$. However, with
paired data, the X and Y observations within each pair are often not independent.
Then $\bar{X}$ and $\bar{Y}$ are not independent of one another. We must therefore abandon the
two-sample t procedures and look for an alternative method of analysis.


The Paired t Test 


Because different pairs are independent, the $D_i$’s are independent of one another. Let
$D = X - Y$, where X and Y are the first and second observations, respectively, within
an arbitrary pair. Then the expected difference is


$$\mu_D = E(X - Y) = E(X) - E(Y) = \mu_1 - \mu_2$$


(the rule of expected values used here is valid even when X and Y are dependent).
Thus any hypothesis about $\mu_1 - \mu_2$ can be phrased as a hypothesis about the
mean difference $\mu_D$. But since the $D_i$’s constitute a normal random sample (of
differences) with mean $\mu_D$, hypotheses about $\mu_D$ can be tested using a one-sample
t test. That is, to test hypotheses about $\mu_1 - \mu_2$ when data is paired, form the
differences $D_1, D_2, \ldots, D_n$ and carry out a one-sample t test (based on n − 1 df) on
these differences.

The Paired t Test


Null hypothesis: $H_0$: $\mu_D = \Delta_0$ (where $D = X - Y$ is the difference
between the first and second observations
within a pair, and $\mu_D = \mu_1 - \mu_2$)

Test statistic value: $t = \dfrac{\bar{d} - \Delta_0}{s_D/\sqrt{n}}$ (where $\bar{d}$ and $s_D$ are the sample mean and
standard deviation, respectively, of the $d_i$’s)


Alternative Hypothesis         P-Value Determination

$H_a$: $\mu_D > \Delta_0$      Area under the $t_{n-1}$ curve to the right of t
$H_a$: $\mu_D < \Delta_0$      Area under the $t_{n-1}$ curve to the left of t
$H_a$: $\mu_D \ne \Delta_0$    2 · (Area under the $t_{n-1}$ curve to the right of |t|)


Assumptions: The $D_i$’s constitute a random sample from a normal “difference”
population.
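
As a rough illustration of the test statistic in the box, the following Python sketch computes $\bar{d}$, $s_D$, and t from two paired samples and applies it to the zinc data of Example 9.8. The function name `paired_t_statistic` and its return layout are assumptions made for this illustration; only the standard library is used, so no P-value is computed here.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(x, y, delta0=0.0):
    """Return (d_bar, s_d, t, df) for the paired t test of H0: mu_D = delta0."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [xi - yi for xi, yi in zip(x, y)]  # within-pair differences
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample standard deviation (divisor n - 1)
    t = (d_bar - delta0) / (s_d / math.sqrt(n))
    return d_bar, s_d, t, n - 1


# Zinc concentrations (mg/L) from Example 9.8
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716]
surface = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609]

d_bar, s_d, t, df = paired_t_statistic(bottom, surface)
print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}, t = {t:.2f}, df = {df}")
```

The value of t would then be referred to the t curve with n − 1 df, using the tail (or tails) indicated by the alternative hypothesis.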


EXAMPLE 9.9 Musculoskeletal neck-and-shoulder disorders are all too common among office staff
who perform repetitive tasks using visual display units. The article “Upper-Arm
Elevation During Office Work” (Ergonomics, 1996: 1221-1230) reported on
a study to determine whether more varied work conditions would have any impact
on arm movement. The accompanying data was obtained from a sample of n = 16
subjects. Each observation is the amount of time, expressed as a proportion of total
time observed, during which arm elevation was below 30°. The two measurements
from each subject were obtained 18 months apart. During this period, work
conditions were changed, and subjects were allowed to engage in a wider variety of work
tasks. Does the data suggest that true average time during which elevation is below
30° differs after the change from what it was before the change?


Subject        1    2    3    4    5    6    7    8
Before        81   87   86   82   90   86   96   73
After         78   91   78   78   84   67   92   70
Difference     3   -4    8    4    6   19    4    3

Subject        9   10   11   12   13   14   15   16
Before        74   75   72   80   66   72   56   82
After         58   62   70   58   66   60   65   73
Difference    16   13    2   22    0   12   -9    9


Figure 9.5 shows a normal probability plot of the 16 differences; the pattern in the
plot is quite straight, supporting the normality assumption. A boxplot of these
differences appears in Figure 9.6; the boxplot is located considerably to the right of
zero, suggesting that perhaps $\mu_D > 0$ (note also that 13 of the 16 differences are
positive and only two are negative).

Let’s now test the appropriate hypotheses.


1. Let $\mu_D$ denote the true average difference between elevation time before the
change in work conditions and time after the change.


2. $H_0$: $\mu_D = 0$ (there is no difference between true average time before the change
and true average time after the change)


3. $H_a$: $\mu_D \ne 0$


Figure 9.5 A normal probability plot from Minitab of the differences in Example 9.9


Figure 9.6 A boxplot of the differences in Example 9.9


4. $t = \dfrac{\bar{d} - 0}{s_D/\sqrt{n}} = \dfrac{\bar{d}}{s_D/\sqrt{n}}$

5. n = 16, $\sum d_i = 108$, and $\sum d_i^2 = 1746$, from which $\bar{d} = 6.75$, $s_D = 8.234$, and

$$t = \frac{6.75}{8.234/\sqrt{16}} = 3.28 \approx 3.3$$

6. Appendix Table A.8 shows that the area to the right of 3.3 under the t curve
with 15 df is .002. The inequality in $H_a$ implies that a two-tailed test is
appropriate, so the P-value is approximately 2(.002) = .004 (Minitab gives .0051).

7. Since .004 < .01, the null hypothesis can be rejected at either significance
level .05 or .01. It does appear that the true average difference between times is
something other than zero; that is, true average time after the change is
different from that before the change.
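
The arithmetic in Example 9.9 can be checked with a short Python sketch. Because the standard library has no t distribution, the upper-tail area is approximated here by integrating the t density numerically with Simpson's rule; the helper names `t_pdf` and `t_upper_tail` and the number of integration steps are assumptions of this sketch, not part of the text's procedure.

```python
import math


def t_pdf(u, df):
    """Density of the t distribution with df degrees of freedom at u."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("t must be nonnegative")
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # the density is symmetric about 0


# Data from Example 9.9
before = [81, 87, 86, 82, 90, 86, 96, 73, 74, 75, 72, 80, 66, 72, 56, 82]
after = [78, 91, 78, 78, 84, 67, 92, 70, 58, 62, 70, 58, 66, 60, 65, 73]

d = [b - a for b, a in zip(before, after)]
n = len(d)
sum_d = sum(d)
sum_d_sq = sum(v * v for v in d)
d_bar = sum_d / n
s_d = math.sqrt((sum_d_sq - sum_d ** 2 / n) / (n - 1))
t = d_bar / (s_d / math.sqrt(n))
p_two_tailed = 2 * t_upper_tail(abs(t), n - 1)

print(f"sum d = {sum_d}, sum d^2 = {sum_d_sq}")
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t:.2f}")
print(f"two-tailed P-value is roughly {p_two_tailed:.4f}")
```

The table-based value .004 in step 6 comes from rounding t to 3.3 before looking up the tail area, which is why it differs slightly from the software value.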


When the number of pairs is large, the assumption of a normal difference dis- 
tribution is not necessary. The CLT validates the resulting z test. 


The Paired t Confidence Interval 


In the same way that the t CI for a single population mean $\mu$ is based on the t
variable $T = (\bar{X} - \mu)/(S/\sqrt{n})$, a t confidence interval for $\mu_D$ $(= \mu_1 - \mu_2)$ is based on
the fact that


$$T = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}}$$


has a t distribution with n − 1 df. Manipulation of this t variable, as in previous
derivations of CI’s, yields the following $100(1 - \alpha)\%$ CI:

The paired t CI for $\mu_D$ is

$$\bar{d} \pm t_{\alpha/2,\,n-1} \cdot s_D/\sqrt{n}$$


A one-sided confidence bound results from retaining the relevant sign and
replacing $t_{\alpha/2}$ by $t_\alpha$.


When n is small, the validity of this interval requires that the distribution of differ- 
ences be at least approximately normal. For large n, the CLT ensures that the result- 
ing z interval is valid without any restrictions on the distribution of differences. 


EXAMPLE 9.10 Magnetic resonance imaging is a commonly used noninvasive technique for assessing
the extent of cartilage damage. However, there is concern that the MRI sizing of articular
cartilage defects may not be accurate. The article “Preoperative MRI Underestimates
Articular Cartilage Defect Size Compared with Findings at Arthroscopic Knee
Surgery” (Amer. J. of Sports Med., 2013: 590-595) reported on a study involving a
sample of 92 cartilage defects. For each one, the size of the lesion area was determined
by an MRI analysis and also during arthroscopic surgery. Each MRI value was then
subtracted from the corresponding arthroscopic value to obtain a difference value. The
sample mean difference was calculated to be 1.04 cm², with a sample standard deviation
of 1.67. Let’s now calculate a confidence interval using a confidence level of (at least
approximately) 95% for $\mu_D$, the mean difference for the population of all such defects
(as did the authors of the cited article). Because n is quite large here, we use the z critical
value $z_{.025} = 1.96$ (an entry at the very bottom of our t table). The resulting CI is


$$1.04 \pm (1.96) \cdot \frac{1.67}{\sqrt{92}} = 1.04 \pm .34 = (.70, 1.38)$$


At the 95% confidence level, we believe that $.70 < \mu_D < 1.38$. Perhaps the most
interesting aspect of this interval is that 0 is not included; only certain positive values
of $\mu_D$ are plausible. It is this fact that led the investigators to conclude that MRIs
tend to underestimate defect size.
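
The interval in Example 9.10 can be reproduced from the summary statistics alone. The sketch below uses the z critical value from Python's standard library; the function name `large_sample_paired_ci` is an assumption for illustration, and the z-based interval is appropriate only when the number of pairs is large, as it is here.

```python
import math
from statistics import NormalDist


def large_sample_paired_ci(d_bar, s_d, n, conf=0.95):
    """Large-sample CI for mu_D: d_bar +/- z_{alpha/2} * s_d / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z critical value, about 1.96 for 95%
    half_width = z * s_d / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width


# Summary statistics from Example 9.10
lo, hi = large_sample_paired_ci(d_bar=1.04, s_d=1.67, n=92)
print(f"95% CI for mu_D: ({lo:.2f}, {hi:.2f})")
```

For a small number of pairs, the z critical value would be replaced by the t critical value with n − 1 df, which the standard library does not provide.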


Paired Data and Two-Sample t Procedures


Consider using the two-sample t test on paired data. The numerators of the two test
statistics are identical, since $\bar{d} = \sum d_i/n = [\sum (x_i - y_i)]/n = (\sum x_i)/n - (\sum y_i)/n = \bar{x} - \bar{y}$.
The difference between the statistics is due entirely to the denominators. Each test
statistic is obtained by standardizing $\bar{X} - \bar{Y}$ $(= \bar{D})$. But in the presence of dependence the
two-sample t standardization is incorrect. To see this, recall from Section 5.5 that

$$V(X \pm Y) = V(X) + V(Y) \pm 2\,\mathrm{Cov}(X, Y)$$

The correlation between X and Y is


$$\rho = \mathrm{Corr}(X, Y) = \mathrm{Cov}(X, Y)/[\sqrt{V(X)} \cdot \sqrt{V(Y)}]$$


It follows that

$$V(X - Y) = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2$$

Applying this to $\bar{X} - \bar{Y}$ yields


$$V(\bar{X} - \bar{Y}) = V(\bar{D}) = V\left(\frac{1}{n}\sum D_i\right) = \frac{V(D_i)}{n} = \frac{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}{n}$$


The two-sample t test is based on the assumption of independence, in which
case $\rho = 0$. But in many paired experiments, there will be a strong positive
dependence between X and Y (large X associated with large Y), so that $\rho$ will be positive
and the variance of $\bar{X} - \bar{Y}$ will be smaller than $\sigma_1^2/n + \sigma_2^2/n$. Thus whenever there
is positive dependence within pairs, the denominator for the paired t statistic should
be smaller than for t of the independent-samples test. Often two-sample t will be
much closer to zero than paired t, considerably understating the significance of the
data.

Similarly, when data is paired, the paired t CI will usually be narrower than the
(incorrect) two-sample t CI. This is because there is typically much less variability
in the differences than in the x and y values.
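
To see the effect of the denominators numerically, the following sketch computes both statistics for the zinc data of Example 9.8. It also checks the sample analogue of the variance identity above, namely that the sample variance of the differences equals $s_1^2 + s_2^2 - 2 r s_1 s_2$, where r is the sample correlation. Variable names are chosen for this illustration only, and the two-sample statistic is shown purely for comparison, since it is not a valid analysis of paired data.

```python
import math
from statistics import mean, variance

# Zinc concentrations (mg/L) from Example 9.8
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716]
surface = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609]
n = len(bottom)

x_bar, y_bar = mean(bottom), mean(surface)
s1_sq, s2_sq = variance(bottom), variance(surface)  # sample variances

# Sample covariance and correlation between the paired observations
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(bottom, surface)) / (n - 1)
r = cov_xy / math.sqrt(s1_sq * s2_sq)

# Sample variance of the within-pair differences
d = [x - y for x, y in zip(bottom, surface)]
var_d = variance(d)

# Same numerator, different denominators
t_paired = mean(d) / math.sqrt(var_d / n)
t_two_sample = (x_bar - y_bar) / math.sqrt(s1_sq / n + s2_sq / n)

print(f"sample correlation r = {r:.3f}")
print(f"var of differences = {var_d:.6f}")
print(f"s1^2 + s2^2 - 2*cov = {s1_sq + s2_sq - 2 * cov_xy:.6f}")
print(f"paired t = {t_paired:.2f}, two-sample t = {t_two_sample:.2f}")
```

With a sample correlation this close to 1, most of the location-to-location variability cancels in the differences, so the paired statistic is several times larger than the two-sample one.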


Paired Versus Unpaired Experiments 


In our examples, paired data resulted from two observations on the same subject 
(Example 9.9) or experimental object (location in Example 9.8). Even when this can- 
not be done, paired data with dependence within pairs can be obtained by matching 
individuals or objects on one or more characteristics thought to influence responses. 
For example, in a medical experiment to compare the efficacy of two drugs for 
lowering blood pressure, the experimenter’s budget might allow for the treatment 
of 20 patients. If 10 patients are randomly selected for treatment with the first 
drug and another 10 independently selected for treatment with the second drug, an 
independent-samples experiment results. 

However, the experimenter, knowing that blood pressure is influenced by age 
and weight, might decide to create pairs of patients so that within each of the result- 
ing 10 pairs, age and weight were approximately equal (though there might be siz- 
able differences between pairs). Then each drug would be given to a different patient 
within each pair for a total of 10 observations on each drug. 

Without this matching (or “blocking”), one drug might appear to outperform 
the other just because patients in one sample were lighter and younger and thus more 
susceptible to a decrease in blood pressure than the heavier and older patients in the 
second sample. However, there is a price to be paid for pairing—a smaller number 
of degrees of freedom for the paired analysis—so we must ask when one type of 
experiment should be preferred to the other. 

There is no straightforward and precise answer to this question, but there
are some useful guidelines. If we have a choice between two t tests that are both
valid (and carried out at the same level of significance $\alpha$), we should prefer the
test that has the larger number of degrees of freedom. The reason for this is that
a larger number of degrees of freedom means smaller $\beta$ for any fixed alternative
value of the parameter or parameters. That is, for a fixed type I error
probability, the probability of a type II error is decreased by increasing degrees of
freedom.

However, if the experimental units are quite heterogeneous in their responses,
it will be difficult to detect small but significant differences between two
treatments. This is essentially what happened in the data set in Example 9.8; for both
“treatments” (bottom water and surface water), there is great between-location
variability, which tends to mask differences in treatments within locations. If there
is a high positive correlation within experimental units or subjects, the variance
of $\bar{D} = \bar{X} - \bar{Y}$ will be much smaller than the unpaired variance. Because of this
reduced variance, it will be easier to detect a difference with paired samples than
with independent samples. The pros and cons of pairing can now be summarized
as follows.

1. If there is great heterogeneity between experimental units and a large
correlation within experimental units (large positive $\rho$), then the loss in degrees
of freedom will be compensated for by the increased precision associated
with pairing, so a paired experiment is preferable to an independent-samples
experiment.

2. If the experimental units are relatively homogeneous and the correlation
within pairs is not large, the gain in precision due to pairing will be
outweighed by the decrease in degrees of freedom, so an independent-samples
experiment should be used.


Of course, values of $\sigma_1^2$, $\sigma_2^2$, and $\rho$ will not usually be known very precisely, so an
investigator will be required to make an educated guess as to whether Situation 1 or 2
obtains. In general, if the number of observations that can be obtained is large, then a
loss in degrees of freedom (e.g., from 40 to 20) will not be serious; but if the number
is small, then the loss (say, from 16 to 8) because of pairing may be serious if not
compensated for by increased precision. Similar considerations apply when choosing
between the two types of experiments to estimate $\mu_1 - \mu_2$ with a confidence interval.
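
The trade-off can be made concrete with a small sketch that evaluates $V(\bar{D}) = (\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2)/n$ for several values of $\rho$ and lists the degrees of freedom of each design. The function name `var_d_bar` and the values $\sigma_1 = \sigma_2 = 1$ and n = 10 are assumptions chosen only for illustration; in practice these quantities would have to be guessed, as noted above.

```python
import math


def var_d_bar(sigma1, sigma2, rho, n):
    """Variance of D-bar for n pairs with within-pair correlation rho."""
    return (sigma1 ** 2 + sigma2 ** 2 - 2 * rho * sigma1 * sigma2) / n


n = 10                 # pairs, or observations per sample if unpaired (assumed)
sigma1 = sigma2 = 1.0  # assumed population standard deviations

# Independent samples of size n each: rho = 0 and 2n - 2 df
se_independent = math.sqrt(var_d_bar(sigma1, sigma2, 0.0, n))
print(f"independent samples: SE = {se_independent:.3f}, df = {2 * n - 2}")

# Paired design: n - 1 df, with standard error shrinking as rho grows
for rho in (0.0, 0.5, 0.9):
    se_paired = math.sqrt(var_d_bar(sigma1, sigma2, rho, n))
    print(f"paired, rho = {rho:.1f}: SE = {se_paired:.3f}, df = {n - 1}")
```

When $\rho$ is near 0 the paired design gives the same standard error with only half the degrees of freedom, whereas for large positive $\rho$ the reduction in standard error dominates.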


EXERCISES Section 9.3 (36—48) 


36. Consider the accompanying data on breaking load House 
(kg/25 mm width) for various fabrics in both an 
unabraded condition and an abraded condition (“The 10 11 12) 13 14 #15 «#16 © «17 
Effect of Wet Abrasive Wear on the Tensile Properties 
of Cotton and Polyester-Cotton Fabrics,” J. Testing Indoor AS 17 AT 18 8 1B BD 


and Evaluation, 1993: 84-93). Use the paired t test, as Outdoor 28 320 320 «155 66.29.21 :1.02 
did the authors of the cited article, to test Hy: wp = 0 


versus H,: lp > 0 at significance level .01. House 
Fabric 18 19 20 21 22 23 24 25 
1 9: 3 4 5 6 7 8 Indoor 20 22 22 23 ©2325) «6.26 28 
Outdoor 1.59 .90 52 .12 54 88 49 1.24 
U 364 55.0 51.5 38.7 43.2 48.8 25.6 49.8 
A 285 20.0 46.0 345 365 52.5 265 46.5 House 
37. Hexavalent chromium has been identified as an 26 27 28 29 300 31 32 33 


inhalation carcinogen and an air toxin of concern in a 
number of different locales. The article ‘Airborne 
Hexavalent Chromium in Southwestern Ontario” (J. 
of Air and Waste Mgmnt. Assoc., 1997: 905-910) gave 
the accompanying data on both indoor and outdoor 
concentration (nanograms/m*) for a sample of houses 
selected from a certain region. 


Indoor 28 29 34 39 40 45 54 .62 
Outdoor 48 27) 37) 1.26 £70) £76 99S —=#386 


a. Calculate a confidence interval for the population 
mean difference between indoor and outdoor con- 
centrations using a confidence level of 95%, and 
interpret the resulting interval. 

Fuse b. If a 34th house were to be randomly selected from 

1 2 3 4 5 e 7 8 9 the population, between what values would you pre- 

dict the difference in concentrations to lie? 


Indoor 07 08 09 12 12 12 13 14.15 38. Adding computerized medical images to a database 
Outdoor .29 68 47 54 97 35 49  .84 .86 promises to provide great resources for physicians. 
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39. 


However, there are other methods of obtaining such 
information, so the issue of efficiency of access needs to 
be investigated. The article ‘The Comparative 
Effectiveness of Conventional and Digital Image 
Libraries”(J. of Audiovisual Media in Medicine, 2001: 
8-15) reported on an experiment in which 13 computer- 
proficient medical professionals were timed both while 
retrieving an image from a library of slides and while 
retrieving the same image from a computer database with 
a Web front end. 


Subject 1 2 3 4 5 6 7 
Slide 30.35 40 25 20 30 35 
Digital 25. 16 15) 15 10 20 7 
Difference 5 19 25. 10 10 10 28 
Subject 8 9 10 11 12 13 
Slide 62 40 51 25 42 33 
Digital 16 15 13. 11 19 19 


Difference 46 25 38 «14 23. «14 


a. Construct a comparative boxplot of times for the two 
types of retrieval, and comment on any interesting 
features. 

b. Estimate the difference between true average times for 
the two types of retrieval in a way that conveys infor- 
mation about precision and reliability. Be sure to check 
the plausibility of any assumptions needed in your 
analysis. Does it appear plausible that the true average 
times for the two types of retrieval are identical? Why 
or why not? 


39. Scientists and engineers frequently wish to compare two
different techniques for measuring or determining the 
value of a variable. In such situations, interest centers on 
testing whether the mean difference in measurements is 
zero. The article “Evaluation of the Deuterium Dilution 
Technique Against the Test Weighing Procedure for 
the Determination of Breast Milk Intake” (Amer. J. of 
Clinical Nutr., 1983: 996-1003) reports the accompany- 
ing data on amount of milk ingested by each of 14 ran- 
domly selected infants. 


Infant          1      2      3      4      5
DD method    1509   1418   1561   1556   2169
TW method    1498   1254   1336   1565   2000
Difference     11    164    225     -9    169

Infant          6      7      8      9     10
DD method    1760   1098   1198   1479   1281
TW method    1318   1410   1129   1342   1124
Difference    442   -312     69    137    157

Infant         11     12     13     14
DD method    1414   1954   2174   2058
TW method    1468   1604   1722   1518
Difference    -54    350    452    540


a. Is it plausible that the population distribution of differences
is normal?

b. Does it appear that the true average difference
between intake values measured by the two methods
is something other than zero? Determine the P-value
of the test, and use it to reach a conclusion at significance
level .05.


40. Lactation promotes a temporary loss of bone mass to
provide adequate amounts of calcium for milk produc- 
tion. The paper “Bone Mass Is Recovered from 
Lactation to Postweaning in Adolescent Mothers with 
Low Calcium Intakes” (Amer. J. of Clinical Nutr., 
2004: 1322-1326) gave the following data on total body 
bone mineral content (TBBMC) (g) for a sample both 
during lactation (L) and in the postweaning period (P). 


Subject     1     2     3     4     5     6     7     8     9    10
L        1928  2549  2825  1924  1628  2175  2114  2621  1843  2541
P        2126  2885  2895  1942  1750  2184  2164  2626  2006  2627


a. Does the data suggest that true average total body 
bone mineral content during postweaning exceeds 
that during lactation by more than 25 g? State and 
test the appropriate hypotheses using a significance 
level of .05. [Note: The appropriate normal proba- 
bility plot shows some curvature but not enough to 
cast substantial doubt on a normality assumption. ] 

b. Calculate an upper confidence bound using a 95% 
confidence level for the true average difference 
between TBBMC during postweaning and during 
lactation. 

c. Does the (incorrect) use of the two-sample t test to 
test the hypotheses suggested in (a) lead to the same 
conclusion that you obtained there? Explain. 


41. Antipsychotic drugs are widely prescribed for conditions
such as schizophrenia and bipolar disease. The
article “Cardiometabolic Risk of Second- 
Generation Antipsychotic Medications During 
First-Time Use in Children and Adolescents” (J. of 
the Amer. Med. Assoc., 2009) reported on body com- 
position and metabolic changes for individuals who 
had taken various antipsychotic drugs for short peri- 
ods of time. 
a. The sample of 41 individuals who had taken aripiprazole
had a mean change in total cholesterol (mg/dL) of
3.75, and the estimated standard error $s_D/\sqrt{n}$ was
3.878. Calculate a confidence interval with confidence
level approximately 95% for the true average increase
in total cholesterol under these circumstances (the cited
article included this CI).

b. The article also reported that for a sample of 36 indi- 
viduals who had taken quetiapine, the sample mean 
cholesterol level change and estimated standard error 
were 9.05 and 4.256, respectively. Making any neces- 
sary assumptions about the distribution of change in 
cholesterol level, does the choice of significance level 
impact your conclusion as to whether true average 
cholesterol level increases? Explain. [Note: The arti- 
cle included a P-value.] 

c. For the sample of 45 individuals who had taken olanza- 
pine, the article reported (7.38, 9.69) as a 95% CI for 
true average weight gain (kg). What is a 99% CI? 


42. Many freeways have service (or logo) signs that give
information on attractions, camping, lodging, food, and 
gas services prior to off-ramps. These signs typically do 
not provide information on distances. The article 
“Evaluation of Adding Distance Information to 
Freeway-Specific Service (Logo) Signs” (J. of 
Transp. Engr., 2011: 782-788) reported that in one 
investigation, six sites along Virginia interstate high- 
ways where service signs are posted were selected. For 
each site, crash data was obtained for a three-year 
period before distance information was added to the 
service signs and for a one-year period afterward. The 
number of crashes per year before and after the sign 
changes were as follows: 


Before: 15 26 66 115 62 64 
After: 16 24 42 80 78 73 


a. The cited article included the statement “A paired t test
was performed to determine whether there was any
change in the mean number of crashes before and after
the addition of distance information on the signs.”
Carry out such a test. [Note: The relevant normal
probability plot shows a substantial linear pattern.]

b. If a seventh site were to be randomly selected among
locations bearing service signs, between what values
would you predict the difference in number of
crashes to lie?


43. Cushing’s disease is characterized by muscular weakness
due to adrenal or pituitary dysfunction. To provide effective
treatment, it is important to detect childhood Cushing’s
disease as early as possible. Age at onset of symptoms and
age at diagnosis (months) for 15 children suffering from
the disease were given in the article “Treatment of
Cushing’s Disease in Childhood and Adolescence by
Transphenoidal Microadenomectomy” (New Engl. J.
of Med., 1984: 889). Here are the values of the differences
between age at onset of symptoms and age at diagnosis:


24 12 35: 15 30 60 14 21 
48 12 25 33 61 69 80 
a. Does the accompanying normal probability plot cast
strong doubt on the approximate normality of the
population distribution of differences?

[Figure: normal probability plot of Difference versus z percentile]


b. Calculate a lower 95% confidence bound for the 
population mean difference, and interpret the 
resulting bound. 

c. Suppose the (age at diagnosis) — (age at onset) differ- 
ences had been calculated. What would be a 95% 
upper confidence bound for the corresponding popu- 
lation mean difference? 


44. Refer back to the previous exercise. 


a. By far the most frequently tested null hypothesis
when data is paired is $H_0: \mu_D = 0$. Is that a sensible
hypothesis in this context? Explain.

b. Carry out a test of hypotheses to decide whether 
there is compelling evidence for concluding that on 
average diagnosis occurs more than 25 months after 
the onset of symptoms. 


45. Torsion during hip external rotation (ER) and extension
may be responsible for certain kinds of injuries in
golfers and other athletes. The article “Hip Rotational
Velocities During the Full Golf Swing” (J. of Sports
Science and Medicine, 2009: 296-299) reported on a
study in which peak ER velocity and peak IR (internal rotation)
velocity (both in deg·sec$^{-1}$) were determined for a
sample of 15 female collegiate golfers during their swings.
The following data was supplied by the article’s authors.


Golfer      ER       IR     diff   z perc
  1      -130.6    -98.9   -31.7   -1.28
  2      -125.1   -115.9    -9.2   -0.97
  3       -51.7   -161.6   109.9    0.34
  4      -179.7   -196.9    17.2   -0.73
  5      -130.5   -170.7    40.2   -0.34
  6      -101.0   -274.9   173.9    0.97
  7       -24.4   -275.0   250.6    1.83
  8      -231.1   -275.7    44.6   -0.17
  9      -186.8   -214.6    27.8   -0.52
 10       -58.5   -117.8    59.3    0.00
 11      -219.3   -326.7   107.4    0.17
 12      -113.1   -272.9   159.8    0.73
 13      -244.3   -429.1   184.8    1.28
 14      -184.4   -140.6   -43.8   -1.83
 15      -199.2   -345.6   146.4    0.52


a. Is it plausible that the differences came from a normally
distributed population?

b. The article reported that mean (± SD) = -145.3 (68.0)
for ER velocity and = -227.8 (96.6) for IR velocity.
Based just on this information, could a test of hypotheses
about the difference between true average IR
velocity and true average ER velocity be carried out?
Explain.

c. The article stated that “The lead hip peak IR velocity was
significantly greater than the trail hip ER velocity
(p = 0.003, t value = 3.65).” (The phrasing suggests
that an upper-tailed test was used.) Is that in fact the case?
[Note: “p = .033” in Table 2 of the article is erroneous.]


46. Example 7.11 gave data on the modulus of elasticity
obtained 1 minute after loading in a certain configura- 
tion. The cited article also gave the values of modulus of 
elasticity obtained 4 weeks after loading for the same 
lumber specimens. The data is presented here. 


Observation 1 min 4 weeks Difference 
1 10,490 9,110 1380 
2 16,620 13,250 3370 
3 17,300 14,720 2580 
4 15,480 12,740 2740 
5 12,970 10,120 2850 
6 17,260 14,570 2690 
7 13,400 11,220 2180 
8 13,900 11,100 2800 
9 13,630 11,420 2210 

10 13,260 10,910 2350 
11 14,370 12,110 2260 
12 11,700 8,620 3080 
13 15,470 12,590 2880 
14 17,840 15,090 2750 
15 14,070 10,550 3520 
16 14,760 12,230 2530 


Calculate and interpret an upper confidence bound for the
true average difference between 1-minute modulus and
4-week modulus; first check the plausibility of any necessary
assumptions.


47. The article “Slender High-Strength RC Columns
Under Eccentric Compression” (Magazine of Concrete 
Res., 2005: 361-370) gave the accompanying data on 
cylinder strength (MPa) for various types of columns 
cured under both moist conditions and laboratory drying 
conditions. 


Type     1     2     3     4     5     6
M:    82.6  87.1  89.5  88.8  94.3  80.0
LD:   86.9  87.3  92.0  89.3  91.4  85.9

Type     7     8     9    10    11    12
M:    86.7  92.5  97.8  90.4  94.6  91.6
LD:   89.4  91.8  94.3  92.0  93.1  91.3


a. Estimate the difference in true average strength 
under the two drying conditions in a way that con- 
veys information about reliability and precision, and 
interpret the estimate. What does the estimate sug- 
gest about how true average strength under moist 
drying conditions compares to that under laboratory 
drying conditions? 

b. Check the plausibility of any assumptions that 
underlie your analysis of (a). 


48. Construct a paired data set for which $t = \infty$, so that the
data is highly significant when the correct analysis is
used, yet t for the two-sample t test is quite near zero,
so the incorrect analysis yields an insignificant result.


9.4 Inferences Concerning a Difference Between Population Proportions


Having presented methods for comparing the means of two different populations, we
now turn attention to the comparison of two population proportions. Regard an
individual or object as a success S if he/she/it possesses some characteristic of interest
(someone who graduated from college, a refrigerator with an icemaker, etc.). Let


$p_1$ = the proportion of S’s in population #1


$p_2$ = the proportion of S’s in population #2


Alternatively, $p_1$ ($p_2$) can be regarded as the probability that a randomly selected
individual or object from the first (second) population is a success.

Suppose that a sample of size m is selected from the first population and
independently a sample of size n is selected from the second one. Let X denote the number
of S’s in the first sample and Y be the number of S’s in the second. Independence of the
two samples implies that X and Y are independent. Provided that the two sample sizes
are much smaller than the corresponding population sizes, X and Y can be regarded
as having binomial distributions. The natural estimator for $p_1 - p_2$, the difference
in population proportions, is the corresponding difference in sample proportions
$X/m - Y/n$.


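PROPOSITION Let $\hat{p}_1 = X/m$ and $\hat{p}_2 = Y/n$, where $X \sim \text{Bin}(m, p_1)$ and $Y \sim \text{Bin}(n, p_2)$ with
X and Y independent variables. Then

$$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$$

so $\hat{p}_1 - \hat{p}_2$ is an unbiased estimator of $p_1 - p_2$, and

$$V(\hat{p}_1 - \hat{p}_2) = \frac{p_1 q_1}{m} + \frac{p_2 q_2}{n} \qquad (\text{where } q_i = 1 - p_i) \qquad (9.3)$$


Proof Since $E(X) = mp_1$ and $E(Y) = np_2$,

$$E\left(\frac{X}{m} - \frac{Y}{n}\right) = \frac{1}{m}E(X) - \frac{1}{n}E(Y) = \frac{1}{m}mp_1 - \frac{1}{n}np_2 = p_1 - p_2$$

Since $V(X) = mp_1q_1$, $V(Y) = np_2q_2$, and X and Y are independent,

$$V\left(\frac{X}{m} - \frac{Y}{n}\right) = V\left(\frac{X}{m}\right) + V\left(\frac{Y}{n}\right) = \frac{1}{m^2}V(X) + \frac{1}{n^2}V(Y) = \frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}$$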

We will focus first on situations in which both m and n are large. Then because
$\hat{p}_1$ and $\hat{p}_2$ individually have approximately normal distributions, the estimator
$\hat{p}_1 - \hat{p}_2$ also has approximately a normal distribution. Standardizing $\hat{p}_1 - \hat{p}_2$ yields
a variable Z whose distribution is approximately standard normal:

$$Z = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1 q_1}{m} + \dfrac{p_2 q_2}{n}}}$$


A Large-Sample Test Procedure 


The most general null hypothesis an investigator might consider would be of the
form $H_0: p_1 - p_2 = \Delta_0$. Although for population means the case $\Delta_0 \neq 0$ presented
no difficulties, for population proportions $\Delta_0 = 0$ and $\Delta_0 \neq 0$ must be considered
separately. Since the vast majority of actual problems of this sort involve $\Delta_0 = 0$ (i.e.,
the null hypothesis $p_1 = p_2$), we’ll concentrate on this case. When $H_0: p_1 - p_2 = 0$
is true, let p denote the common value of $p_1$ and $p_2$ (and similarly for q). Then the
standardized variable

$$Z = \frac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{pq\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}} \qquad (9.4)$$

has approximately a standard normal distribution when $H_0$ is true. However, this Z
cannot serve as a test statistic because the value of p is unknown: $H_0$ asserts only
that there is a common value of p, but does not say what that value is. A test statistic
results from replacing p and q in (9.4) by appropriate estimators.
Assuming that $p_1 = p_2 = p$, instead of separate samples of size m and n from
two different populations (two different binomial distributions), we really have
a single sample of size m + n from one population with proportion p. The total
number of individuals in this combined sample having the characteristic of interest
is X + Y. The natural estimator of p is then

$$\hat{p} = \frac{X + Y}{m + n} = \frac{m}{m + n}\cdot\hat{p}_1 + \frac{n}{m + n}\cdot\hat{p}_2 \qquad (9.5)$$

The second expression for $\hat{p}$ shows that it is actually a weighted average of estimators
$\hat{p}_1$ and $\hat{p}_2$ obtained from the two samples. Using $\hat{p}$ and $\hat{q} = 1 - \hat{p}$ in place of p
and q in (9.4) gives a test statistic having approximately a standard normal distribution
when $H_0$ is true.


Null hypothesis: $H_0: p_1 - p_2 = 0$

Test statistic value (large samples): $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$

Alternative Hypothesis       P-Value Determination
$H_a: p_1 - p_2 > 0$         Area under the standard normal curve to the right of z
$H_a: p_1 - p_2 < 0$         Area under the standard normal curve to the left of z
$H_a: p_1 - p_2 \neq 0$      2 · (Area under the standard normal curve to the right of |z|)

The test can safely be used as long as $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are all at least 10.


EXAMPLE 9.11 The article “Aspirin Use and Survival After Diagnosis of Colorectal Cancer”
(J. of the Amer. Med. Assoc., 2009: 649-658) reported that of 549 study participants
who regularly used aspirin after being diagnosed with colorectal cancer, there
were 81 colorectal cancer-specific deaths, whereas among 730 similarly diagnosed
individuals who did not subsequently use aspirin, there were 141 colorectal cancer-specific
deaths. Does this data suggest that the regular use of aspirin after diagnosis
will decrease the incidence rate of colorectal cancer-specific deaths? Let’s test the
appropriate hypotheses using a significance level of .05.

The parameter of interest is the difference $p_1 - p_2$, where $p_1$ is the true proportion
of deaths for those who regularly used aspirin and $p_2$ is the true proportion of
deaths for those who did not use aspirin. The use of aspirin is beneficial if $p_1 < p_2$,
which corresponds to a negative difference between the two proportions. The relevant
hypotheses are therefore

$$H_0: p_1 - p_2 = 0 \quad \text{versus} \quad H_a: p_1 - p_2 < 0$$

Parameter estimates are $\hat{p}_1 = 81/549 = .1475$, $\hat{p}_2 = 141/730 = .1932$, and
$\hat{p} = (81 + 141)/(549 + 730) = .1736$. A z test is appropriate here because all of
$m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are at least 10. The resulting test statistic value is

$$z = \frac{.1475 - .1932}{\sqrt{(.1736)(.8264)\left(\dfrac{1}{549} + \dfrac{1}{730}\right)}} = \frac{-.0457}{.021397} = -2.14$$


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 


394 CHAPTER 9 Inferences Based on Two Samples 


The corresponding P-value for a lower-tailed z test is $\Phi(-2.14) = .0162$. Because
$.0162 \le .05$, the null hypothesis can be rejected at significance level .05. So anyone
adopting this significance level would be convinced that the use of aspirin in these
circumstances is beneficial. However, someone looking for more compelling evidence
might select a significance level .01 and then not be persuaded. ■
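The calculation in Example 9.11 can be reproduced with a few lines of code. The sketch below is our own illustration of the pooled large-sample z test from the box above; the function name and the `alternative` labels are assumptions made for this example, not notation from the text. It uses only the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x, m, y, n, alternative="two-sided"):
    """Pooled large-sample z test of H0: p1 - p2 = 0.

    x, y are the success counts and m, n the sample sizes.
    alternative is "greater", "less", or "two-sided" (labels are our own).
    Returns the z statistic and its P-value.
    """
    p1_hat, p2_hat = x / m, y / n
    p_pooled = (x + y) / (m + n)  # estimator (9.5)
    std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / m + 1 / n))
    z = (p1_hat - p2_hat) / std_error
    phi = NormalDist().cdf  # standard normal cdf
    if alternative == "greater":
        p_value = 1 - phi(z)
    elif alternative == "less":
        p_value = phi(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - phi(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Data of Example 9.11: aspirin users first, nonusers second.
z, p_value = two_proportion_z_test(81, 549, 141, 730, alternative="less")
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

Carried out without intermediate rounding, the statistic comes out near -2.13 rather than the -2.14 obtained from the rounded estimates, and the P-value is correspondingly close to .0165; the conclusion at levels .05 and .01 is unchanged.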


Type II Error Probabilities and Sample Sizes 


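Here the determination of $\beta$ is a bit more cumbersome than it was for other large-sample
tests. The reason is that the denominator of Z is an estimate of the standard
deviation of $\hat{p}_1 - \hat{p}_2$, assuming that $p_1 = p_2 = p$. When $H_0$ is false, $\hat{p}_1 - \hat{p}_2$ must be
restandardized using

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}} \qquad (9.6)$$

The form of $\sigma$ implies that $\beta$ is not a function of just $p_1 - p_2$, so we denote it by
$\beta(p_1, p_2)$.

Alternative Hypothesis       $\beta(p_1, p_2)$

$H_a: p_1 - p_2 > 0$         $\Phi\left[\dfrac{z_{\alpha}\sqrt{\bar{p}\bar{q}\left(\frac{1}{m} + \frac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$

$H_a: p_1 - p_2 < 0$         $1 - \Phi\left[\dfrac{-z_{\alpha}\sqrt{\bar{p}\bar{q}\left(\frac{1}{m} + \frac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$

$H_a: p_1 - p_2 \neq 0$      $\Phi\left[\dfrac{z_{\alpha/2}\sqrt{\bar{p}\bar{q}\left(\frac{1}{m} + \frac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right] - \Phi\left[\dfrac{-z_{\alpha/2}\sqrt{\bar{p}\bar{q}\left(\frac{1}{m} + \frac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$

where $\bar{p} = (mp_1 + np_2)/(m + n)$, $\bar{q} = (mq_1 + nq_2)/(m + n)$, and $\sigma$ is given
by (9.6).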
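The three expressions in the box translate directly into code. The following is a minimal sketch under the same large-sample approximation; the function name, argument names, and the numerical inputs are hypothetical and serve only to illustrate how $\beta(p_1, p_2)$ could be evaluated.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(p1, p2, m, n, alpha=0.05, alternative="greater"):
    """Approximate beta(p1, p2) for the pooled large-sample z test.

    alternative is "greater", "less", or "two-sided" (labels are our own).
    """
    std_normal = NormalDist()
    q1, q2 = 1 - p1, 1 - p2
    p_bar = (m * p1 + n * p2) / (m + n)
    q_bar = (m * q1 + n * q2) / (m + n)
    null_se = sqrt(p_bar * q_bar * (1 / m + 1 / n))  # test's denominator
    sigma = sqrt(p1 * q1 / m + p2 * q2 / n)          # formula (9.6)
    d = p1 - p2
    if alternative == "greater":
        z_crit = std_normal.inv_cdf(1 - alpha)
        return std_normal.cdf((z_crit * null_se - d) / sigma)
    if alternative == "less":
        z_crit = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf((-z_crit * null_se - d) / sigma)
    if alternative == "two-sided":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        upper = std_normal.cdf((z_crit * null_se - d) / sigma)
        lower = std_normal.cdf((-z_crit * null_se - d) / sigma)
        return upper - lower
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# Hypothetical alternative values and sample sizes, for illustration only.
beta = type_ii_error(0.30, 0.20, 200, 200, alpha=0.05, alternative="greater")
print(f"beta(.30, .20) with m = n = 200: {beta:.4f}")
```

A quick sanity check on the formulas: when $p_1 = p_2$ the two standard deviations coincide, so the upper-tailed expression reduces to $\Phi(z_\alpha) = 1 - \alpha$, as it should.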


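Proof For the upper-tailed test ($H_a: p_1 - p_2 > 0$),

$$\beta(p_1, p_2) = P\left[\hat{p}_1 - \hat{p}_2 < z_{\alpha}\sqrt{\hat{p}\hat{q}\left(\frac{1}{m} + \frac{1}{n}\right)}\right]$$

$$= P\left[\frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sigma} < \frac{z_{\alpha}\sqrt{\hat{p}\hat{q}\left(\frac{1}{m} + \frac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$$

When m and n are both large,

$$\hat{p} = (m\hat{p}_1 + n\hat{p}_2)/(m + n) \approx (mp_1 + np_2)/(m + n) = \bar{p}$$

and $\hat{q} \approx \bar{q}$, which yields the previous (approximate) expression for $\beta(p_1, p_2)$. ■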
Alternatively, for specified $p_1$, $p_2$ with $p_1 - p_2 = d$, the sample sizes necessary
to achieve $\beta(p_1, p_2) = \beta$ can be determined. For example, for the upper-tailed test,
we equate $-z_{\beta}$ to the argument of $\Phi(\cdot)$ (i.e., what’s inside the parentheses) in the
foregoing box. If m = n, there is a simple expression for the common value.


For the case m = n, the level $\alpha$ test has type II error probability $\beta$ at the
alternative values $p_1$, $p_2$ with $p_1 - p_2 = d$ when

$$n = \frac{\left[z_{\alpha}\sqrt{(p_1 + p_2)(q_1 + q_2)/2} + z_{\beta}\sqrt{p_1 q_1 + p_2 q_2}\right]^2}{d^2} \qquad (9.7)$$

for an upper- or lower-tailed test, with $\alpha/2$ replacing $\alpha$ for a two-tailed test.


EXAMPLE 9.12 One of the truly impressive applications of statistics occurred in connection with the 
design of the 1954 Salk polio-vaccine experiment and analysis of the resulting data. 
Part of the experiment focused on the efficacy of the vaccine in combating paralytic 
polio. Because it was thought that without a control group of children, there would 
be no sound basis for assessment of the vaccine, it was decided to administer the 
vaccine to one group and a placebo injection (visually indistinguishable from the 
vaccine but known to have no effect) to a control group. For ethical reasons and also 
because it was thought that the knowledge of vaccine administration might have an 
effect on treatment and diagnosis, the experiment was conducted in a double-blind 
manner. That is, neither the individuals receiving injections nor those administering 
them actually knew who was receiving vaccine and who was receiving the placebo 
(samples were numerically coded). (Remember: at that point it was not at all clear 
whether the vaccine was beneficial.) 

Let $p_1$ and $p_2$ be the probabilities of a child getting paralytic polio for
the control and treatment conditions, respectively. The objective was to test
$H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$ (the alternative states that a vaccinated
child is less likely to contract polio than an unvaccinated child). Supposing the true
value of $p_1$ is .0003 (an incidence rate of 30 per 100,000), the vaccine would be
a significant improvement if the incidence rate was halved, that is, $p_2 = .00015$.
Using a level $\alpha = .05$ test, it would then be reasonable to ask for sample sizes for
which $\beta = .1$ when $p_1 = .0003$ and $p_2 = .00015$. Assuming equal sample sizes, the
required n is obtained from (9.7) as

$$n = \frac{\left[1.645\sqrt{(.5)(.00045)(1.99955)} + 1.28\sqrt{(.00015)(.99985) + (.0003)(.9997)}\right]^2}{(.0003 - .00015)^2}$$

$$= [(.0349 + .0271)/.00015]^2 \approx 171{,}000$$

The actual data for this experiment follows. Sample sizes of approximately
200,000 were used. The reader can easily verify that z = 6.43, a highly significant
value. The vaccine was judged a resounding success!


Placebo: m = 201,229, x = number of cases of paralytic polio = 110
Vaccine: n = 200,745, y = 33 ■
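Both numerical claims in Example 9.12 are easy to check by machine. The sketch below is our own illustration: one function evaluates formula (9.7) for the common sample size, and a second evaluates the pooled z statistic for the trial data. The function names are hypothetical, and exact normal critical values are used in place of the rounded 1.645 and 1.28.

```python
from math import sqrt
from statistics import NormalDist


def equal_sample_size(p1, p2, alpha=0.05, beta=0.10, two_tailed=False):
    """Common sample size n from formula (9.7), not rounded up."""
    std_normal = NormalDist()
    tail_area = alpha / 2 if two_tailed else alpha
    z_alpha = std_normal.inv_cdf(1 - tail_area)
    z_beta = std_normal.inv_cdf(1 - beta)
    q1, q2 = 1 - p1, 1 - p2
    root_term = (z_alpha * sqrt((p1 + p2) * (q1 + q2) / 2)
                 + z_beta * sqrt(p1 * q1 + p2 * q2))
    return root_term**2 / (p1 - p2) ** 2


def pooled_z(x, m, y, n):
    """z statistic for H0: p1 - p2 = 0 using the pooled estimator (9.5)."""
    p_pooled = (x + y) / (m + n)
    std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / m + 1 / n))
    return (x / m - y / n) / std_error


n_required = equal_sample_size(0.0003, 0.00015, alpha=0.05, beta=0.10)
z_trial = pooled_z(110, 201229, 33, 200745)  # placebo first, vaccine second

print(f"required n per group: {n_required:,.0f}")
print(f"z for the trial data: {z_trial:.2f}")
```

With unrounded critical values the required n lands a little above the hand-computed 171,000, which is the kind of small discrepancy rounding in the text would produce.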
A Large-Sample Confidence Interval


As with means, many two-sample problems involve the objective of comparison
through hypothesis testing, but sometimes an interval estimate for $p_1 - p_2$ is
appropriate. Both $\hat{p}_1 = X/m$ and $\hat{p}_2 = Y/n$ have approximate normal distributions
when m and n are both large. If we identify $\theta$ with $p_1 - p_2$, then $\hat{\theta} = \hat{p}_1 - \hat{p}_2$ satisfies
the conditions necessary for obtaining a large-sample CI. In particular, the estimated
standard deviation of $\hat{\theta}$ is $\sqrt{(\hat{p}_1\hat{q}_1/m) + (\hat{p}_2\hat{q}_2/n)}$. The general $100(1 - \alpha)\%$ interval
$\hat{\theta} \pm z_{\alpha/2} \cdot \hat{\sigma}_{\hat{\theta}}$ then takes the following form.


A CI for $p_1 - p_2$ with confidence level approximately $100(1 - \alpha)\%$ is

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{m} + \frac{\hat{p}_2\hat{q}_2}{n}}$$

This interval can safely be used as long as $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are all at
least 10.


Notice that the estimated standard deviation of $\hat{p}_1 - \hat{p}_2$ (the square-root expression)
is different here from what it was for hypothesis testing when $\Delta_0 = 0$.

Recent research has shown that the actual confidence level for the traditional
CI just given can sometimes deviate substantially from the nominal level (the level
you think you are getting when you use a particular z critical value, e.g., 95% when
$z_{\alpha/2} = 1.96$). The suggested improvement is to add one success and one failure to each
of the two samples and then replace the $\hat{p}$’s and $\hat{q}$’s in the foregoing formula by $\tilde{p}$’s
and $\tilde{q}$’s where $\tilde{p}_1 = (x + 1)/(m + 2)$, etc. This modified interval can also be used
when sample sizes are quite small.


EXAMPLE 9.13 Do people who work long hours have more trouble sleeping? An investigation 
into this issue was described in the article “Long Working Hours and Sleep 
Disturbances: The Whitehall II Prospective Cohort Study” (Sleep, 2009: 737- 
745). In one sample of 1501 British civil servants who worked more than 40 hours a 
week, 750 said they usually get less than 7 hours of sleep per night. In another sam- 
ple of 958 British civil servants who worked between 35 and 40 hours per week, 407 
said they usually get less than 7 hours of sleep per night. The investigators believed 
that these samples were representative of the populations to which they belong. 

Let $p_1$ denote the proportion of British civil servants working more than 40 hours
per week who usually get less than 7 hours of sleep per night, and let $p_2$ be the corresponding
proportion for the 35-40 hours population. The point estimates of $p_1$ and $p_2$ are

$$\hat{p}_1 = \frac{750}{1501} = .500 \qquad \hat{p}_2 = \frac{407}{958} = .425$$

from which $\hat{q}_1 = .500$, $\hat{q}_2 = .575$. All quantities $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, $n\hat{q}_2$ are much larger
than 10, so the large-sample CI for $p_1 - p_2$ can be used. The 99% interval is

$$.500 - .425 \pm 2.58\sqrt{\frac{(.500)(.500)}{1501} + \frac{(.425)(.575)}{958}} = .075 \pm (2.58)(.020534)$$

$$= .075 \pm .053 = (.022, .128)$$

At the 99% confidence level, we estimate that the proportion of those working longer
hours who usually get less than 7 hours of sleep per night exceeds the corresponding
proportion for those who work fewer hours by between .022 and .128. The fact
that this interval includes only positive values suggests that those who work longer
hours tend to get less sleep. But the study is observational rather than randomized
controlled, so it would be dangerous to infer a causal relationship between work
hours and amount of sleep. Because of the large sample sizes, the modified interval
that uses $\tilde{p}_1$, $\tilde{q}_1$, $\tilde{p}_2$, and $\tilde{q}_2$ is identical to the one we calculated. ■
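The interval of Example 9.13, and the claim that the modified interval agrees with it, can be verified with a short sketch. The function below is our own illustration (its name and the `plus_one` flag are hypothetical); with `plus_one=True` it adds one success and one failure to each sample before applying the traditional formula.

```python
from math import sqrt
from statistics import NormalDist


def proportion_difference_ci(x, m, y, n, conf_level=0.95, plus_one=False):
    """Large-sample CI for p1 - p2.

    With plus_one=True, one success and one failure are added to each
    sample first, giving the modified interval described in the text.
    """
    if plus_one:
        x, m, y, n = x + 1, m + 2, y + 1, n + 2
    p1, p2 = x / m, y / n
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z_crit * sqrt(p1 * (1 - p1) / m + p2 * (1 - p2) / n)
    return p1 - p2 - half_width, p1 - p2 + half_width


# Data of Example 9.13: more than 40 hours first, 35-40 hours second.
traditional = proportion_difference_ci(750, 1501, 407, 958, conf_level=0.99)
modified = proportion_difference_ci(750, 1501, 407, 958, conf_level=0.99,
                                    plus_one=True)

print("traditional: ({:.3f}, {:.3f})".format(*traditional))
print("modified:    ({:.3f}, {:.3f})".format(*modified))
```

The two intervals differ only beyond the third decimal place here; with small samples the adjustment would matter far more.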


Small-Sample Inferences 


On occasion an inference concerning p, — p, may have to be based on samples for 
which at least one sample size is small. Appropriate methods for such situations 
are not as straightforward as those for large samples, and there is more controversy 
among statisticians as to recommended procedures. One frequently used test, called 
the Fisher–Irwin test, is based on the hypergeometric distribution. Your friendly
neighborhood statistician can be consulted for more information. 
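To give a sense of how the hypergeometric distribution enters, here is a minimal sketch of an upper-tailed exact test. The text does not spell out the procedure, so the formulation below is an assumption on our part: conditional on the total number of successes x + y, the count X has a hypergeometric distribution when $p_1 = p_2$, and the P-value is the probability of a count at least as large as the one observed. The function name and the small data set are hypothetical.

```python
from math import comb


def exact_upper_tail_p_value(x, m, y, n):
    """P(X >= x | X + Y = x + y) when p1 = p2 (hypergeometric tail).

    x successes in a sample of m, y successes in a sample of n.
    """
    total_successes = x + y
    denominator = comb(m + n, total_successes)
    largest_x = min(m, total_successes)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible splits contribute nothing to the sum.
    numerator = sum(
        comb(m, k) * comb(n, total_successes - k)
        for k in range(x, largest_x + 1)
    )
    return numerator / denominator


# Hypothetical small samples: 3 of 4 successes versus 1 of 4.
p_value = exact_upper_tail_p_value(3, 4, 1, 4)
print(f"exact upper-tailed P-value: {p_value:.4f}")
```

For these hypothetical counts the tail probability is 17/70, so even a 75% versus 25% split is not convincing with samples this small.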


EXERCISES Section 9.4 (49-58) 


49. Consider the following two questions designed to assess
quantitative literacy:
a. What is 15% of 1000?
b. A store is offering a 15% off sale on all TVs. The
most popular television is normally priced at $1000.
How much money would a customer save on the
television during this sale?
Suppose the first question is asked of 200 randomly
selected college students, with 164 answering correctly;
the second one is asked of a different random sample of
200 college students, resulting in 140 correct responses
(the sample percentages agree with those given in the
article “Using the Right Yardstick: Assessing Financial
Literacy Measures by Way of Financial Well-Being,”
J. of Consumer Affairs, 2013: 243-262; the investigators
found that those who answered such questions correctly,
particularly questions with context, were significantly
more successful in their investment decisions than those
who did not answer correctly). Carry out a test of hypotheses
at significance level .05 to decide if the true proportion
of correct responses to the question without context
exceeds that for the one with context.

50. Recent incidents of food contamination have caused great
concern among consumers. The article “How Safe Is That
Chicken?” (Consumer Reports, Jan. 2010: 19-23) reported
that 35 of 80 randomly selected Perdue brand broilers
tested positively for either campylobacter or salmonella (or
both), the leading bacterial causes of food-borne disease,
whereas 66 of 80 Tyson brand broilers tested positive.
a. Does it appear that the true proportion of non-contaminated
Perdue broilers differs from that for the
Tyson brand? Carry out a test of hypotheses using a
significance level .01.
b. If the true proportions of non-contaminated chickens
for the Perdue and Tyson brands are .50 and .25,
respectively, how likely is it that the null hypothesis of
equal proportions will be rejected when a .01 significance
level is used and the sample sizes are both 80?

51. It is well known that a placebo, a fake medication or
treatment, can sometimes have a positive effect just
because patients often expect the medication or treatment
to be helpful. The article “Beware the Nocebo
Effect” (New York Times, Aug. 12, 2012) gave examples
of a less familiar phenomenon, the tendency for patients
informed of possible side effects to actually experience
those side effects. The article cited a study reported in
The Journal of Sexual Medicine in which a group of
patients diagnosed with benign prostatic hyperplasia was
randomly divided into two subgroups. One subgroup of
size 55 received a compound of proven efficacy along
with counseling that a potential side effect of the treatment
was erectile dysfunction. The other subgroup of
size 52 was given the same treatment without counseling.
The percentage of the no-counseling subgroup that
reported one or more sexual side effects was 15.3%,
whereas 43.6% of the counseling subgroup reported at
least one sexual side effect. State and test the appropriate
hypotheses at significance level .05 to decide whether the
nocebo effect is operating here. [Note: The estimated
expected number of “successes” in the no-counseling
sample is a bit shy of 10, but not by enough to be of great
concern (some sources use a less conservative cutoff of
5 rather than 10).]

52. Do teachers find their work rewarding and satisfying?
The article “Work-Related Attitudes” (Psychological
Reports, 1991: 443-450) reports the results of a survey
of 395 elementary school teachers and 266 high school
teachers. Of the elementary school teachers, 224 said
they were very satisfied with their jobs, whereas 126 of
the high school teachers were very satisfied with their
work. Estimate the difference between the proportion of
all elementary school teachers who are very satisfied and
all high school teachers who are very satisfied by calculating
and interpreting a CI.


53. Olestra is a fat substitute approved by the FDA for use in
snack foods. Because there have been anecdotal reports of
gastrointestinal problems associated with olestra consumption,
a randomized, double-blind, placebo-controlled
experiment was carried out to compare olestra potato
chips to regular potato chips with respect to GI symptoms
(“Gastrointestinal Symptoms Following Consumption
of Olestra or Regular Triglyceride Potato Chips,” J. of
the Amer. Med. Assoc., 1998: 150-152). Among 529
individuals in the TG control group, 17.6% experienced
an adverse GI event, whereas among the 563 individuals
in the olestra treatment group, 15.8% experienced such
an event.

a. Carry out a test of hypotheses at the 5% significance 
level to decide whether the incidence rate of GI prob- 
lems for those who consume olestra chips according 
to the experimental regimen differs from the inci- 
dence rate for the TG control treatment. 


b. If the true percentages for the two treatments were 
15% and 20%, respectively, what sample sizes 
(m = n) would be necessary to detect such a differ- 
ence with probability .90? 


54. Teen Court is a juvenile diversion program designed to
circumvent the formal processing of first-time juvenile 
offenders within the juvenile justice system. The article 
“An Experimental Evaluation of Teen Courts” (J. of 
Experimental Criminology, 2008: 137-163) reported on 
a study in which offenders were randomly assigned 
either to Teen Court or to the traditional Department of 
Juvenile Services method of processing. Of the 56 TC 
individuals, 18 subsequently recidivated (look it up!) 
during the 18-month follow-up period, whereas 12 of the 
51 DJS individuals did so. Does the data suggest that the 
true proportion of TC individuals who recidivate during 
the specified follow-up period differs from the propor- 
tion of DJS individuals who do so? State and test the 
relevant hypotheses using a significance level of .10. 


55. In medical investigations, the ratio $\theta = p_1/p_2$ is often of
more interest than the difference $p_1 - p_2$ (e.g., individuals
given treatment 1 are how many times as likely to
recover as those given treatment 2?). Let $\hat{\theta} = \hat{p}_1/\hat{p}_2$.
When m and n are both large, the statistic $\ln(\hat{\theta})$ has
approximately a normal distribution with approximate
mean value $\ln(\theta)$ and approximate standard deviation
$[(m - x)/(mx) + (n - y)/(ny)]^{1/2}$.

a. Use these facts to obtain a large-sample 95% CI formula
for estimating $\ln(\theta)$, and then a CI for $\theta$ itself.

b. Return to the heart-attack data of Example 1.3, and
calculate an interval of plausible values for $\theta$ at the
95% confidence level. What does this interval suggest
about the efficacy of the aspirin treatment?


Sometimes experiments involving success or failure 
responses are run in a paired or before/after manner. 
Suppose that before a major policy speech by a political 
candidate, n individuals are selected and asked whether 
(S) or not (F) they favor the candidate. Then after the 
speech the same n people are asked the same question. 
The responses can be entered in a table as follows: 


                  After
                S       F
Before    S   $x_1$   $x_2$
          F   $x_3$   $x_4$


where $x_1 + x_2 + x_3 + x_4 = n$. Let $p_1$, $p_2$, $p_3$, and $p_4$
denote the four cell probabilities, so that $p_1$ = P(S before
and S after), and so on. We wish to test the hypothesis that
the true proportion of supporters (S) after the speech has
not increased against the alternative that it has increased.


a. State the two hypotheses of interest in terms of $p_1$, $p_2$,
$p_3$, and $p_4$.

b. Construct an estimator for the after/before difference
in success probabilities.

c. When n is large, it can be shown that the rv $(X_3 - X_2)/n$
has approximately a normal distribution with variance
given by $[p_2 + p_3 - (p_3 - p_2)^2]/n$. Use this to construct
a test statistic with approximately a standard normal
distribution when $H_0$ is true (the result is called
McNemar’s test).

d. If $x_1 = 350$, $x_2 = 150$, $x_3 = 200$, and $x_4 = 300$,
what do you conclude?


Two different types of alloy, A and B, have been used to 
manufacture experimental specimens of a small tension link 
to be used in a certain engineering application. The ultimate 
strength (ksi) of each specimen was determined, and the 
results are summarized in the accompanying frequency 
distribution. 


Strength (ksi)      A       B
26 - < 30           6       4
30 - < 34          12       9
34 - < 38          15      19
38 - < 42           7      10

                 m = 40  n = 42


Compute a 95% CI for the difference between the true 
proportions of all specimens of alloys A and B that have 
an ultimate strength of at least 34 ksi. 


Using the traditional formula, a 95% CI for $p_1 - p_2$ is to
be constructed based on equal sample sizes from the two
populations. For what value of n (= m) will the resulting
interval have a width of at most .1, irrespective of the
results of the sampling?
9.5 Inferences Concerning Two Population Variances


Methods for comparing two population variances (or standard deviations) are occa- 
sionally needed, though such problems arise much less frequently than those involv- 
ing means or proportions. For the case in which the populations under investigation 
are normal, the procedures are based on a new family of probability distributions. 


The F Distribution 


The F probability distribution has two parameters, denoted by $\nu_1$ and $\nu_2$. The
parameter $\nu_1$ is called the number of numerator degrees of freedom, and $\nu_2$ is the
number of denominator degrees of freedom; here $\nu_1$ and $\nu_2$ are positive integers. A
random variable that has an F distribution cannot assume a negative value. Since the
density function is complicated and will not be used explicitly, we omit the formula.
There is an important connection between an F variable and chi-squared variables. If
$X_1$ and $X_2$ are independent chi-squared rv’s with $\nu_1$ and $\nu_2$ df, respectively, then
the rv

$$F = \frac{X_1/\nu_1}{X_2/\nu_2} \qquad (9.8)$$

(the ratio of the two chi-squared variables divided by their respective degrees of
freedom), can be shown to have an F distribution.

Figure 9.7 illustrates the graph of a typical F density function. Analogous to
the notation $t_{\alpha,\nu}$ and $\chi^2_{\alpha,\nu}$, we use $F_{\alpha,\nu_1,\nu_2}$ for the value on the
horizontal axis that captures $\alpha$ of the area under the F density curve with $\nu_1$ and
$\nu_2$ df in the upper tail. The density curve is not symmetric, so it would seem that
both upper- and lower-tail critical values must be tabulated. This is not necessary,
though, because of the fact that

$$F_{1-\alpha,\nu_1,\nu_2} = \frac{1}{F_{\alpha,\nu_2,\nu_1}}$$

[Figure 9.7 An F density curve and critical value: the F density curve with $\nu_1$ and
$\nu_2$ df, with shaded area $\alpha$ to the right of $F_{\alpha,\nu_1,\nu_2}$]


Appendix Table A.9 gives $F_{\alpha,\nu_1,\nu_2}$ for $\alpha$ = .10, .05, .01, and .001, and
various values of $\nu_1$ (in different columns of the table) and $\nu_2$ (in different groups of
rows of the table). For example, $F_{.05,6,10} = 3.22$ and $F_{.05,10,6} = 4.06$. The critical value
$F_{.95,6,10}$, which captures .95 of the area to its right (and thus .05 to the left) under the
F curve with $\nu_1 = 6$ and $\nu_2 = 10$, is $F_{.95,6,10} = 1/F_{.05,10,6} = 1/4.06 = .246$.
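The reciprocal relationship between lower- and upper-tail critical values can be checked empirically. The following is a minimal simulation sketch, not part of the original text: it assumes only Python's standard library, draws chi-squared variables as gamma variables with shape $\nu/2$ and scale 2, and uses an arbitrary seed and number of draws. The helper name `simulate_f` is hypothetical.

```python
import random

def simulate_f(v1, v2, rng):
    """Draw one F variate as (X1/v1)/(X2/v2) with independent chi-squared X1, X2."""
    # A chi-squared rv with v df is a gamma rv with shape v/2 and scale 2.
    x1 = rng.gammavariate(v1 / 2, 2.0)
    x2 = rng.gammavariate(v2 / 2, 2.0)
    return (x1 / v1) / (x2 / v2)

rng = random.Random(12345)          # arbitrary seed, for reproducibility only
draws = [simulate_f(6, 10, rng) for _ in range(200_000)]

upper_crit = 3.22                   # F_{.05,6,10}, as quoted from Table A.9
lower_crit = 1 / 4.06               # F_{.95,6,10} = 1 / F_{.05,10,6}

# Both tail fractions should be close to .05 if the table values are right.
upper_tail = sum(f > upper_crit for f in draws) / len(draws)
lower_tail = sum(f < lower_crit for f in draws) / len(draws)

print(f"lower critical value = {lower_crit:.3f}")
print(f"simulated upper-tail area = {upper_tail:.4f}")
print(f"simulated lower-tail area = {lower_tail:.4f}")
```

Both simulated tail areas should land near .05, up to simulation error, which is what the reciprocal property predicts.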


The F Test for Equality of Variances 


A test procedure for hypotheses concerning the ratio $\sigma_1^2/\sigma_2^2$ is based on the following
result.
THEOREM Let $X_1, \ldots, X_m$ be a random sample from a normal distribution with variance
$\sigma_1^2$, let $Y_1, \ldots, Y_n$ be another random sample (independent of the $X_i$’s) from a
normal distribution with variance $\sigma_2^2$, and let $S_1^2$ and $S_2^2$ denote the two sample
variances. Then the rv

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \qquad (9.9)$$

has an F distribution with $\nu_1 = m - 1$ and $\nu_2 = n - 1$.


This theorem results from combining (9.8) with the fact that the variables
$(m - 1)S_1^2/\sigma_1^2$ and $(n - 1)S_2^2/\sigma_2^2$ each have a chi-squared distribution with m − 1
and n − 1 df, respectively (see Section 7.4). Because F involves a ratio rather than a
difference, the test statistic is the ratio of sample variances. The claim that $\sigma_1^2 = \sigma_2^2$
is implausible if the ratio differs by too much from 1.
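The theorem can be illustrated by simulation. The sketch below is an illustration under stated assumptions rather than anything from the text: the sample sizes (m = 5, n = 7, so 4 and 6 df), the two standard deviations, the seed, and the helper names are all arbitrary choices. It checks that the standardized variance ratio exceeds the tabulated value $F_{.05,4,6} = 4.53$ about 5% of the time.

```python
import random

def sample_variance(values):
    """Unbiased sample variance with divisor (number of values - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def standardized_variance_ratio(m, n, sigma1, sigma2, rng):
    """Compute (S1^2/sigma1^2) / (S2^2/sigma2^2) for two independent normal samples."""
    xs = [rng.gauss(0.0, sigma1) for _ in range(m)]
    ys = [rng.gauss(0.0, sigma2) for _ in range(n)]
    return (sample_variance(xs) / sigma1 ** 2) / (sample_variance(ys) / sigma2 ** 2)

rng = random.Random(2024)   # arbitrary seed
m, n = 5, 7                 # assumed sample sizes, giving v1 = 4 and v2 = 6
ratios = [standardized_variance_ratio(m, n, 3.0, 1.5, rng) for _ in range(50_000)]

# F_{.05,4,6} = 4.53, so roughly 5% of the ratios should exceed it.
fraction_above = sum(r > 4.53 for r in ratios) / len(ratios)
print(f"fraction above 4.53: {fraction_above:.4f}")
```

Changing the two standard deviations should leave the fraction essentially unchanged, since each sample variance is divided by its own population variance.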

Recall that the P-value for an upper-tailed t test is the area under an appropriate
t curve to the right of the calculated t, whereas for a lower-tailed test the P-value
is the area under the curve to the left of t. Analogously, the P-value for an
upper-tailed F test is the area under an appropriate F curve (the one with specified
numerator and denominator dfs) to the right of f, and the P-value for a lower-tailed test
is the area under an F curve to the left of f. Because t curves are symmetric, the P-value
for a two-tailed test is double the captured lower tail area if t is negative and double
the captured upper tail area if t is positive. Although F curves are not symmetric, by
analogy the P-value for a two-tailed F test is twice the captured lower tail area if f
is below the median and twice the captured upper tail area if it is above the median.
Figure 9.8 illustrates this for an upper-tailed test based on $\nu_1 = 4$ and $\nu_2 = 6$.


[Figure 9.8 A P-value for an upper-tailed F test: the F density curve for $\nu_1 = 4$,
$\nu_2 = 6$, with shaded area = P-value = .025 to the right of f = 6.23]


Null hypothesis: $H_0$: $\sigma_1^2 = \sigma_2^2$

Test statistic value: $f = s_1^2/s_2^2$

Alternative Hypothesis          P-Value Determination

$H_a$: $\sigma_1^2 > \sigma_2^2$              $A_R$ = Area under the $F_{m-1,n-1}$ curve to the right of f

$H_a$: $\sigma_1^2 < \sigma_2^2$              $A_L$ = Area under the $F_{m-1,n-1}$ curve to the left of f

$H_a$: $\sigma_1^2 \ne \sigma_2^2$              $2 \cdot \min(A_R, A_L)$

Assumption: The population distributions are both normal, and the two random
samples are independent of one another.
Tabulation of F-curve upper-tail areas is much more cumbersome than for t
curves because two df’s are involved. For each combination of $\nu_1$ and $\nu_2$, our F table
gives only the four critical values that capture areas .10, .05, .01, and .001. Because
of this, the table will generally provide only an upper or lower bound (or both) on
the P-value. For example, suppose the test is upper-tailed and based on 4 numerator
df and 6 denominator df. If f = 5.82, then the P-value is the area under the $F_{4,6}$
curve to the right of 5.82. Because $F_{.05,4,6} = 4.53$, the area to the right of 4.53 is by
definition .05. Similarly, $F_{.01,4,6} = 9.15$ implies that the area under the curve to the
right of this value is .01. Since 5.82 lies in between 4.53 and 9.15, the area to the
right of 5.82 must be between .01 and .05. That is, .01 < P-value < .05. Figure 9.9
shows what can be said about the P-value depending on where f falls relative to the
four relevant tabulated critical values.


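F table entries for $\nu_1 = 4$, $\nu_2 = 6$:

  $\alpha$       Critical value
  .10         3.18
  .05         4.53
  .01         9.15
  .001       21.92

Location of f              Information about the P-value
f < 3.18                   P-value > .10
3.18 < f < 4.53            .05 < P-value < .10
4.53 < f < 9.15            .01 < P-value < .05
9.15 < f < 21.92           .001 < P-value < .01
f > 21.92                  P-value < .001

Figure 9.9 Obtaining P-value information from the F table for an upper-tailed F test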


Again considering a test with $\nu_1 = 4$ and $\nu_2 = 6$,

f = 5.82  ⇒  .01 < P-value < .05
f = 2.16  ⇒  P-value > .10
f = 25.03 ⇒  P-value < .001


Only if f equals a tabulated value do we obtain an exact P-value (e.g., if f = 4.53,
then P-value = .05). Once we know that .01 < P-value < .05, $H_0$ would be rejected
at a significance level of .05 but not at a level of .01. When P-value < .001, $H_0$
should be rejected at any reasonable significance level.
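The table lookup just described is mechanical enough to sketch in code. The function below is a hypothetical helper (its name and the dictionary layout are illustrative assumptions); the four critical values are the ones quoted above for 4 numerator and 6 denominator df.

```python
def bracket_p_value(f, critical_values):
    """Bound the upper-tailed P-value using tabulated critical values.

    critical_values maps an upper-tail area alpha to its critical value.
    Returns (lower, upper) with lower < P-value <= upper.
    """
    lower, upper = 0.0, 1.0
    for alpha, crit in critical_values.items():
        if f >= crit:
            # f is at or beyond this critical value: area to its right is at most alpha
            upper = min(upper, alpha)
        else:
            # f is short of this critical value: area to its right exceeds alpha
            lower = max(lower, alpha)
    return lower, upper

# Critical values for v1 = 4, v2 = 6 as quoted in the text
F_TABLE_4_6 = {0.10: 3.18, 0.05: 4.53, 0.01: 9.15, 0.001: 21.92}

for f in (5.82, 2.16, 25.03):
    print(f, bracket_p_value(f, F_TABLE_4_6))
```

A lower bound of 0.0 or an upper bound of 1.0 means the table supplies no information on that side.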

The F tests discussed in succeeding chapters will all be upper-tailed. If, however,
a lower-tailed F test is appropriate, then lower-tailed critical values should be
obtained as described earlier so that a bound or bounds on the P-value can be
established. In the case of a two-tailed test, the bound or bounds from a one-tailed test
should be multiplied by 2. For example, if f = 5.82 when $\nu_1 = 4$ and $\nu_2 = 6$, then
since 5.82 falls between the .05 and .01 critical values, 2(.01) < P-value < 2(.05),
giving .02 < P-value < .10. $H_0$ would then be rejected if $\alpha$ = .10 but not if $\alpha$ = .01.
In this case, we cannot say from our table what conclusion is appropriate when
$\alpha$ = .05 (since we don’t know whether the P-value is smaller or larger than this).
However, statistical software shows that the area to the right of 5.82 under this F
curve is .029, so the P-value is .058 and the null hypothesis should therefore not be
rejected at level .05. Various statistical software packages will, of course, provide an
exact P-value for any F test.
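For readers without a statistics package, an exact P-value can be sketched with the standard library alone. The code below is one possible approach and not the method of any particular package: it relies on the known identity that P(F ≤ f) equals the regularized incomplete beta function $I_x(\nu_1/2, \nu_2/2)$ with $x = \nu_1 f/(\nu_1 f + \nu_2)$, evaluated by a continued fraction (modified Lentz method). All function names are hypothetical.

```python
import math

def _beta_cf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # even-numbered term of the continued fraction
        num = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        d = 1.0 + num * d
        c = 1.0 + num / c
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # odd-numbered term of the continued fraction
        num = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        d = 1.0 + num * d
        c = 1.0 + num / c
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_cdf(f, v1, v2):
    """P(F <= f) for an F distribution with v1 and v2 df."""
    return reg_inc_beta(v1 / 2, v2 / 2, v1 * f / (v1 * f + v2))

def f_test_p_value(f, v1, v2, alternative="two-sided"):
    """P-value of an F test; alternative is 'greater', 'less', or 'two-sided'."""
    area_left = f_cdf(f, v1, v2)
    # Right-tail area computed directly rather than as 1 - area_left.
    area_right = reg_inc_beta(v2 / 2, v1 / 2, v2 / (v2 + v1 * f))
    if alternative == "greater":
        return area_right
    if alternative == "less":
        return area_left
    return 2 * min(area_left, area_right)

p_upper = f_test_p_value(5.82, 4, 6, "greater")
p_two = f_test_p_value(5.82, 4, 6, "two-sided")
print(f"upper-tailed: {p_upper:.3f}, two-tailed: {p_two:.3f}")
```

For f = 5.82 with 4 and 6 df, the printed values should agree with the .029 and .058 quoted above.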
EXAMPLE 9.14 A random sample of 200 vehicles traveling on gravel roads in a county with a posted
speed limit of 35 mph on such roads resulted in a sample mean speed of 37.5 mph and a
sample standard deviation of 8.6 mph, whereas another random sample of 200 vehicles
in a county with a posted speed limit of 55 mph resulted in a sample mean and sample
standard deviation of 35.8 mph and 9.2 mph, respectively (these means and standard
deviations were reported in the article “Evaluation of Criteria for Setting Speed
Limits on Gravel Roads” (J. of Transp. Engr., 2011: 57-63); the actual sample sizes
result in dfs that exceed the largest of those in our F table). Let’s carry out a test at
significance level .10 to decide whether the two population distribution variances are identical.

1. $\sigma_1^2$ is the variance of the speed distribution on the 35 mph roads, and $\sigma_2^2$ is the
variance of the speed distribution on 55 mph roads.

2. $H_0$: $\sigma_1^2 = \sigma_2^2$

3. $H_a$: $\sigma_1^2 \ne \sigma_2^2$

4. Test statistic value: $f = s_1^2/s_2^2$

5. Calculation: $f = (8.6)^2/(9.2)^2 = .87$

6. P-value determination: .87 lies in the lower tail of the F curve with 199 numerator df
and 199 denominator df. A glance at the F table shows that $F_{.10,199,199} \approx F_{.10,200,200} \approx 1.20$
(consult the $\nu_1 = 120$ and $\nu_1 = 1000$ columns), implying $F_{.90,199,199} \approx 1/1.20 = .83$
(these values are confirmed by software). That is, the area under the relevant F
curve to the left of .83 is .10. Thus the area under the curve to the left of .87 exceeds
.10, and so P-value > 2(.10) = .2 (software gives .342).

7. The P-value clearly exceeds the mandated significance level. The null hypothesis
therefore cannot be rejected; it is plausible that the two speed distribution
variances are identical.

The sample sizes in the cited article were 2665 and 1868, respectively, and the P-value
reported there was .0008. So for the actual data, the hypothesis of equal variances would
be rejected not only at significance level .10 (in contrast to our conclusion) but also
at level .05, .01, and even .001. This illustrates again how quite large sample sizes can
magnify a small difference in estimated values. Note also that the sample mean speed
for the county with the lower posted speed limit was higher than for the county with the
higher limit, a counterintuitive result that surprised the investigators; and because of the
very large sample sizes, this difference in means is highly statistically significant. ■
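The arithmetic of the example can be retraced in a few lines. This is only a sketch of the table-based reasoning: the value 1.20 is the approximate tabulated critical value quoted in the example, and the variable names are illustrative.

```python
# Summary statistics from Example 9.14
s1, s2 = 8.6, 9.2            # sample standard deviations (mph)
alpha = 0.10                 # significance level

f = s1 ** 2 / s2 ** 2        # test statistic value

# Approximate critical values read from the F table, as in the example
upper_crit = 1.20            # F_{.10,199,199}, approximately
lower_crit = 1 / upper_crit  # F_{.90,199,199}, by the reciprocal property

# If f lies strictly between the two critical values, each tail area exceeds .10,
# so the two-tailed P-value exceeds 2(.10) = .20.
each_tail_exceeds_10 = lower_crit < f < upper_crit
reject_h0 = not each_tail_exceeds_10 and alpha >= 0.20

print(f"f = {f:.2f}, lower critical value = {lower_crit:.2f}")
print("P-value > .20:", each_tail_exceeds_10, "| reject H0:", reject_h0)
```

Because the table only bounds the P-value, the code can confirm that $H_0$ is not rejected here but cannot reproduce the exact value .342 that software reports.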


A Confidence Interval for $\sigma_1/\sigma_2$

The CI for $\sigma_1^2/\sigma_2^2$ is based on replacing F in the probability statement

$$P(F_{1-\alpha/2,\nu_1,\nu_2} < F < F_{\alpha/2,\nu_1,\nu_2}) = 1 - \alpha$$

by the F variable (9.9) and manipulating the inequalities to isolate $\sigma_1^2/\sigma_2^2$. An interval
for $\sigma_1/\sigma_2$ results from taking the square root of each limit. The details are left for
an exercise.
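For readers who want to check their own work on that exercise, one way the manipulation can go is sketched here; it uses only the F variable of the theorem with $\nu_1 = m - 1$ and $\nu_2 = n - 1$, and the reciprocal property of F critical values.

```latex
\begin{align*}
1 - \alpha
  &= P\!\left(F_{1-\alpha/2,\,m-1,\,n-1} < \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} < F_{\alpha/2,\,m-1,\,n-1}\right) \\
  &= P\!\left(\frac{S_1^2/S_2^2}{F_{\alpha/2,\,m-1,\,n-1}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{S_1^2/S_2^2}{F_{1-\alpha/2,\,m-1,\,n-1}}\right)
\end{align*}
% Since 1/F_{1-\alpha/2, m-1, n-1} = F_{\alpha/2, n-1, m-1}, the resulting interval for
% \sigma_1^2/\sigma_2^2 with observed values s_1^2 and s_2^2 is
\[
\left(\frac{s_1^2}{s_2^2}\cdot\frac{1}{F_{\alpha/2,\,m-1,\,n-1}},\;
      \frac{s_1^2}{s_2^2}\cdot F_{\alpha/2,\,n-1,\,m-1}\right)
\]
```

Note that the numerator and denominator df trade places in the upper limit.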


EXERCISES Section 9.5 (59-66) 


59. Obtain or compute the following quantities:
a. $F_{.05,5,8}$   b. $F_{.05,8,5}$   c. $F_{.95,5,8}$   d. $F_{.95,8,5}$
e. The 99th percentile of the F distribution with $\nu_1 = 10$, $\nu_2 = 12$
f. The 1st percentile of the F distribution with $\nu_1 = 10$, $\nu_2 = 12$
g. $P(F \le 6.16)$ for $\nu_1 = 6$, $\nu_2 = 4$
h. $P(.177 \le F \le 4.74)$ for $\nu_1 = 10$, $\nu_2 = 5$
Give as much information as you can about the P-value
of the F test in each of the following situations:

a. $\nu_1 = 5$, $\nu_2 = 10$, upper-tailed test, f = 4.75

b. $\nu_1 = 5$, $\nu_2 = 10$, upper-tailed test, f = 2.00

c. $\nu_1 = 5$, $\nu_2 = 10$, two-tailed test, f = 5.64

d. $\nu_1 = 5$, $\nu_2 = 10$, lower-tailed test, f = .200

e. $\nu_1 = 35$, $\nu_2 = 20$, upper-tailed test, f = 3.24


Return to the data on maximum lean angle given in 
Exercise 28 of this chapter. Carry out a test at significance 
level .10 to see whether the population standard deviations 
for the two age groups are different (normal probability 
plots support the necessary normality assumption). 


Refer to Example 9.7. Does the data suggest that the 
standard deviation of the strength distribution for fused 
specimens is smaller than that for not-fused specimens? 
Carry out a test at significance level .01. 


Toxaphene is an insecticide that has been identified as a 
pollutant in the Great Lakes ecosystem. To investigate the 
effect of toxaphene exposure on animals, groups of rats 
were given toxaphene in their diet. The article 
“Reproduction Study of Toxaphene in the Rat” (J. of 
Environ. Sci. Health, 1988: 101-126) reports weight gains 
(in grams) for rats given a low dose (4 ppm) and for control 
rats whose diet did not include the insecticide. The sample 
standard deviation for 23 female control rats was 32 g and 
for 20 female low-dose rats was 54 g. Does this data sug- 
gest that there is more variability in low-dose weight gains 
than in control weight gains? Assuming normality, carry 
out a test of hypotheses at significance level .05. 


The following observations are on time (h) for a AA 1.5-volt
alkaline battery to reach a 0.8 voltage (“Comparing
the Lifetimes of Two Brands of Batteries,” J. of
Statistical Educ., 2013, online):

Energizer: 8.65 8.74 8.91 8.72 8.85 8.52 8.62 8.68 8.86
Ultracell: 8.76 8.81 8.81 8.70 8.73 8.76 8.68 8.64 8.79

Normal probability plots support the assumption that
the population distributions are normal. Does the data
suggest that the variance of the Energizer population
distribution differs from that of the Ultracell population
distribution? Test the relevant hypotheses using a
significance level of .05. [Note: The two-sample t test for
equality of population means gives a P-value of .763.]
The Energizer batteries are much more expensive than
the Ultracell batteries. Would you pay the extra money?

The article “Enhancement of Compressive Properties 
of Failed Concrete Cylinders with Polymer 
Impregnation” (J. of Testing and Evaluation, 1977: 
333-337) reports the following data on impregnated 
compressive modulus (psi × 10^6) when two different
polymers were used to repair cracks in failed concrete.

Epoxy             1.75          2.12   2.05   1.97
MMA prepolymer    [illegible]   1.59   1.70   1.69

Obtain a 90% CI for the ratio of variances by first using 
the method suggested in the text to obtain a general con- 
fidence interval formula. 


Reconsider the data of Example 9.6, and calculate a 95% 
upper confidence bound for the ratio of the standard 
deviation of the triacetate porosity distribution to that of 
the cotton porosity distribution. 


SUPPLEMENTARY EXERCISES (67-95) 


67. The accompanying summary data on compression
strength (lb) for 12 × 10 × 8 in. boxes appeared in the
article “Compression of Single-Wall Corrugated
Shipping Containers Using Fixed and Floating Test
Platens” (J. Testing and Evaluation, 1992: 318-320).
The authors stated that “the difference between the
compression strength using fixed and floating platen
method was found to be small compared to normal
variation in compression strength between identical
boxes.” Do you agree? Is your analysis predicated on
any assumptions?


Method      Sample Size   Sample Mean   Sample SD
Fixed           10            807           27
Floating        10            757           41


68. The article “Supervised Exercise Versus Non-
Supervised Exercise for Reducing Weight in Obese 
Adults” (The J. of Sports Med. and Physical Fitness, 
2009: 85-90) reported on an investigation in which par- 
ticipants were randomly assigned to either a supervised 
exercise program or a control group. Those in the control 
group were told only that they should take measures to 
lose weight. After 4 months, the sample mean decrease 
in body fat for the 17 individuals in the experimental 
group was 6.2 kg with a sample standard deviation of 
4.5 kg, whereas the sample mean and sample standard 
deviation for the 17 people in the control group were 
1.7 kg and 3.1 kg, respectively. Assume normality of the 
two weight-loss distributions (as did the investigators). 
a. Calculate a 99% lower prediction bound for the 
weight loss of a single randomly selected individual 
subjected to the supervised exercise program. Can 
you be highly confident that such an individual will 
actually lose weight? 

b. Does it appear that true average decrease in body fat 
is more than two kg larger for the experimental con- 
dition than for the control condition? Use the accom- 
panying Minitab output to reach a conclusion at 
significance level of .01. [Note: Minitab accepts such 
summary data as well as individual observations. 
Also, because the test is upper-tailed, the software 
provides a lower confidence bound rather than a 
conventional CI.] 


Sample     N   Mean   StDev   SE Mean
Exptl.    17   6.20    4.50       1.1
Control   17   1.70    3.10      0.75

Difference = mu (1) - mu (2)
Estimate for difference: 4.50
95% lower bound for difference: 2.25
T-Test of difference = 2 (vs >):
T-Value = 1.89  P-Value = 0.035  DF = 28


Is the response rate for questionnaires affected by includ- 
ing some sort of incentive to respond along with the 
questionnaire? In one experiment, 110 questionnaires 
with no incentive resulted in 75 being returned, whereas 
98 questionnaires that included a chance to win a lottery 
yielded 66 responses (“Charities, No; Lotteries, No;
Cash, Yes,” Public Opinion Quarterly, 1996: 542-562). 
Does this data suggest that including an incentive increa- 
ses the likelihood of a response? State and test the rele- 
vant hypotheses at significance level .10. 


Shoveling is not exactly a high-tech activity, but it will 
continue to be a required task even in our information 
age. The article “A Shovel with a Perforated Blade 
Reduces Energy Expenditure Required for Digging 
Wet Clay” (Human Factors, 2010: 492-502) reported 
on an experiment in which 13 workers were each pro- 
vided with both a conventional shovel and a shovel 
whose blade was perforated with small holes. The 
authors of the cited article provided the following data on 
stable energy expenditure [kcal/kg(subject)/lb(clay)]:


Worker:          1      2      3      4      5      6      7
Conventional:  .0011  .0014  .0018  .0022  .0010  .0016  .0028
Perforated:    .0011  .0010  .0019  .0013  .0011  .0017  .0024

Worker:          8      9     10     11     12     13
Conventional:  .0020  .0015  .0014  .0023  .0017  .0020
Perforated:    .0020  .0013  .0013  .0017  .0015  .0013


a. Calculate a confidence interval at the 95% confi- 
dence level for the true average difference between 
energy expenditure for the conventional shovel and 
the perforated shovel (the relevant normal 
probability plot shows a reasonably linear pattern). 
Based on this interval, does it appear that the shovels 
differ with respect to true average energy expendi- 
ture? Explain. 

b. Carry out a test of hypotheses at significance level
.05 to see if true average energy expenditure using
the conventional shovel exceeds that using the
perforated shovel.


The article “Quantitative MRI and Electrophysiology 
of Preoperative Carpal Tunnel Syndrome in a Female 
Population” (Ergonomics, 1997: 642-649) reported 
that (—473.13, 1691.9) was a large-sample 95% confi- 
dence interval for the difference between true average 
thenar muscle volume (mm³) for sufferers of carpal tunnel
syndrome and true average volume for nonsufferers.
Calculate and interpret a 90% confidence interval for this 
difference. 


The following summary data on bending strength (Ib-in/in) 
of joints is taken from the article “Bending Strength of 
Corner Joints Constructed with Injection Molded 
Splines” (Forest Products J., April, 1997: 89-92). 


Type                    Sample Size   Sample Mean   Sample SD
Without side coating        10           80.95         9.59
With side coating           10           63.23         5.96


a. Calculate a 95% lower confidence bound for true 
average strength of joints with a side coating. 

b. Calculate a 95% lower prediction bound for the 
strength of a single joint with a side coating. 

c. Calculate an interval that, with 95% confidence,
includes the strength values for at least 95% of the
population of all joints with side coatings.

d. Calculate a 95% confidence interval for the differ- 
ence between true average strengths for the two 
types of joints. 


The article “Urban Battery Litter” cited in Example 
8.14 gave the following summary data on zinc mass (g) 
for two different brands of size D batteries: 


Brand Sample Size Sample Mean Sample SD 
Duracell 15 138.52 7.16 
Energizer 20 149.07 1.52 


Assuming that both zinc mass distributions are at least 
approximately normal, carry out a test at significance 
level .05 to decide whether true average zinc mass is dif- 
ferent for the two types of batteries. 


The derailment of a freight train due to the catastrophic 
failure of a traction motor armature bearing provided the 
impetus for a study reported in the article “Locomotive 
Traction Motor Armature Bearing Life Study” 
(Lubrication Engr., Aug. 1997: 12-19). A sample of 17
high-mileage traction motors was selected, and the amount
of cone penetration (mm/10) was determined both for the
pinion bearing and for the commutator armature bearing,
resulting in the following data:


Motor           1     2     3     4     5     6
Commutator    211   273   305   258   270   209
Pinion        226   278   259   244   273   236

Motor           7     8     9    10    11    12
Commutator    223   288   296   233   262   291
Pinion        290   287   315   242   288   242

Motor          13    14    15    16    17
Commutator    278   275   210   272   264
Pinion        278   208   281   274   268


Calculate an estimate of the population mean difference 
between penetration for the commutator armature bear- 
ing and penetration for the pinion bearing, and do so in 
a way that conveys information about the reliability and 
precision of the estimate. [Note: A normal probability plot 
validates the necessary normality assumption.] Would you 
say that the population mean difference has been precisely 
estimated? Does it look as though population mean pen- 
etration differs for the two types of bearings? Explain. 


Headability is the ability of a cylindrical piece of mate- 
rial to be shaped into the head of a bolt, screw, or other 
cold-formed part without cracking. The article “New 
Methods for Assessing Cold Heading Quality” (Wire J. 
Intl., Oct. 1996: 66-72) described the result of a head- 
ability impact test applied to 30 specimens of aluminum 
killed steel and 30 specimens of silicon killed steel. The 
sample mean headability rating number for the steel 
specimens was 6.43, and the sample mean for aluminum 
specimens was 7.09. Suppose that the sample standard 
deviations were 1.08 and 1.19, respectively. Do you agree 
with the article’s authors that the difference in headability 
ratings is significant at the 5% level (assuming that the 
two headability distributions are normal)? 


The article “Fatigue Testing of Condoms” cited in 
Exercise 7.32 reported that for a sample of 20 natural 
latex condoms of a certain type, the sample mean and 
sample standard deviation of the number of cycles to 
break were 4358 and 2218, respectively, whereas a 
sample of 20 polyisoprene condoms gave a sample mean 
and sample standard deviation of 5805 and 3990, respec- 
tively. Is there strong evidence for concluding that true 
average number of cycles to break for the polyisoprene 
condom exceeds that for the natural latex condom by 
more than 1000 cycles? Carry out a test using a
significance level of .01. [Note: The cited paper reported
P-values of t tests for comparing means of the various
types considered.]


Information about hand posture and forces generated by
the fingers during manipulation of various daily objects
is needed for designing high-tech hand prosthetic
devices. The article “Grip Posture and Forces During
Holding Cylindrical Objects with Circular Grips”
(Ergonomics, 1996: 1163-1176) reported that for a
sample of 11 females, the sample mean four-finger pinch
strength (N) was 98.1 and the sample standard deviation
was 14.2. For a sample of 15 males, the sample mean and
sample standard deviation were 129.2 and 39.1,
respectively.

a. A test carried out to see whether true average
strengths for the two genders were different resulted
in t = 2.51 and P-value = .019. Does the appropriate
test procedure described in this chapter yield this
value of t and the stated P-value?

b. Is there substantial evidence for concluding that true 
average strength for males exceeds that for females 
by more than 25 N? State and test the relevant 
hypotheses. 


The article “Pine Needles as Sensors of Atmospheric 
Pollution” (Environ. Monitoring, 1982: 273-286) 
reported on the use of neutron-activity analysis to deter- 
mine pollutant concentration in pine needles. According to 
the article’s authors, “These observations strongly indicat- 
ed that for those elements which are determined well by 
the analytical procedures, the distribution of concentration 
is lognormal. Accordingly, in tests of significance the loga- 
rithms of concentrations will be used.” The given data 
refers to bromine concentration in needles taken from a site 
near an oil-fired steam plant and from a relatively clean 
site. The summary values are means and standard devia- 
tions of the log-transformed observations. 


Site          Sample Size   Mean Log Concentration   SD of Log Concentration
Steam plant        8               18.0                      4.9
Clean              9               11.0                      4.6


Let $\mu_1^*$ be the true average log concentration at the first
site, and define $\mu_2^*$ analogously for the second site.

a. Use the pooled t test (based on assuming normality and
equal standard deviations) to decide at significance
level .05 whether the two concentration distribution
means are equal.

b. If $\sigma_1^*$ and $\sigma_2^*$ (the standard deviations of the two log
concentration distributions) are not equal, would $\mu_1$
and $\mu_2$ (the means of the concentration distributions)
be the same if $\mu_1^* = \mu_2^*$? Explain your reasoning.


The article “The Accuracy of Stated Energy Contents 
of Reduced-Energy, Commercially Prepared Foods” 
(J. of the Amer. Dietetic Assoc., 2010: 116-123) 
presented the accompanying data on vendor-stated gross
energy and measured value (both in kcal) for 10 different
supermarket convenience meals:


Meal:        1    2    3    4    5    6    7    8    9   10
Stated:    180  220  190  230  200  370  250  240   80  180
Measured:  212  319  231  306  211  431  288  265  145  228


Carry out a test of hypotheses to decide whether the true 
average % difference from that stated differs from zero. 
[Note: The article stated “Although formal statistical meth- 
ods do not apply to convenience samples, standard statistical 
tests were employed to summarize the data for exploratory 
purposes and to suggest directions for future studies.”’] 


80. Arsenic is a known carcinogen and poison. The standard
laboratory procedures for measuring arsenic concentration
(µg/L) in water are expensive. Consider the accompanying
summary data and Minitab output for comparing
a laboratory method to a new relatively quick and
inexpensive field method (from the article “Evaluation of a
New Field Measurement Method for Arsenic in
Drinking Water Samples,” J. of Envir. Engr., 2008:
382-388).


Two-Sample T-Test and CI 


Sample   N    Mean   StDev   SE Mean
1        3   19.70    1.10      0.64
2        3   10.90    0.60      0.35


Estimate for difference: 8.800 


95% CI for difference: (6.498, 11.102) 
T-Test of difference =0 (vs not =): 
T-Value =12.16 P-Value =0.001 DF =3 


What conclusion do you draw about the two methods, 
and why? Interpret the given confidence interval. [Note: 
One of the article’s authors indicated in private com- 
munication that they were unsure why the two methods 
disagreed. ] 


81. The accompanying data on response time appeared in the
article “The Extinguishment of Fires Using Low-Flow
Water Hose Streams—Part II” (Fire Technology,
1991: 291-320).


Good visibility:   .43  1.17   .37   .47   .68   .58   .50  2.75
Poor visibility:  1.47   .80  1.58  1.53  4.33  4.23  3.25  3.22


The authors analyzed the data with the pooled t test. Does the
use of this test appear justified? [Hint: Check for normality.
The z percentiles for n = 8 are −1.53, −.89, −.49, −.15,
.15, .49, .89, and 1.53.]


82. Acrylic bone cement is commonly used in total joint 
arthroplasty as a grout that allows for the smooth transfer of 
loads from a metal prosthesis to bone structure. The paper 
“Validation of the Small-Punch Test as a Technique for 
Characterizing the Mechanical Properties of Acrylic 
Bone Cement” (J. of Engr. in Med., 2006: 11-21) gave 
the following data on breaking force (N): 
Temp   Medium   n   $\bar{x}$        s
22°    Dry      6   170.60   39.08
37°    Dry      6   325.73   34.97
22°    Wet      6   366.36   34.82
37°    Wet      6   306.09   41.97


Assume that all population distributions are normal. 

a. Estimate true average breaking force in a dry medium at 
37° in a way that conveys information about reliability 
and precision, and interpret your estimate. 

b. Estimate the difference between true average break- 
ing force in a dry medium at 37° and true average 
force at the same temperature in a wet medium, and 
do so in a way that conveys information about preci- 
sion and reliability. Then interpret your estimate. 

c. Is there strong evidence for concluding that true 
average force in a dry medium at the higher tempera- 
ture exceeds that at the lower temperature by more 
than 100 N? 


83. In an experiment to compare bearing strengths of pegs 
inserted in two different types of mounts, a sample of 14 
observations on stress limit for red oak mounts resulted in a 
sample mean and sample standard deviation of 8.48 MPa 
and .79 MPa, respectively, whereas a sample of 12 observa- 
tions when Douglas fir mounts were used gave a mean of 
9.36 and a standard deviation of 1.52 (“Bearing Strength
of White Oak Pegs in Red Oak and Douglas Fir
Timbers,” J. of Testing and Evaluation, 1998: 109-114).
Consider testing whether or not true average stress limits are 
identical for the two types of mounts. Compare df’s and 
P-values for the unpooled and pooled t tests. 


84. How does energy intake compare to energy expenditure? 
One aspect of this issue was considered in the article 
“Measurement of Total Energy Expenditure by the 
Doubly Labelled Water Method in Professional 
Soccer Players” (J. of Sports Sciences, 2002: 391-397), 
which contained the accompanying data (MJ/day). 


Player         1      2      3      4      5      6      7 
Expenditure   14.4   12.1   14.3   14.2   15.2   15.5   17.8 
Intake        14.6    9.2   11.8   11.6   12.7   15.0   16.3 


Test to see whether there is a significant difference between 
intake and expenditure. Does the conclusion depend on 
whether a significance level of .05, .01, or .001 is used? 


85. An experimenter wishes to obtain a CI for the difference 
between true average breaking strength for cables manu- 
factured by company I and by company II. Suppose 
breaking strength is normally distributed for both types 
of cable with $\sigma_1 = 30$ psi and $\sigma_2 = 20$ psi. 

a. If costs dictate that the sample size for the type I 
cable should be three times the sample size for the 
type II cable, how many observations are required if 
the 99% CI is to be no wider than 20 psi? 


b. Suppose a total of 400 observations is to be made. 
How many of the observations should be made on 
type I cable samples if the width of the resulting 
interval is to be a minimum? 


86. A study was carried out to compare two different methods, 
injection and nasal spray, for administering flu vaccine 
to children under the age of 5. All 8000 children in 
the study were given both an injection and a spray. 
However, the vaccine given to 4000 of the children actually 
contained just saltwater, and the spray given to the 
other 4000 children also contained just saltwater. At 
the end of the flu season, it was determined that 3.9% of 
the children who received the real vaccine via nasal spray 
contracted the flu, whereas 8.6% of the 4000 children 
receiving the real vaccine via injection contracted the flu. 

a. Why do you think each child received both an injec- 
tion and a spray? 

b. Does one method for delivering the vaccine appear to 
be superior to the other? Test the appropriate hypoth- 
eses. [Note: The study was described in the article 
“Spray Flu Vaccine May Work Better Than 
Injections for Tots,” San Luis Obispo Tribune, 
May 2, 2006.] 


87. Wait staff at restaurants have employed various strategies 
to increase tips. An article in the Sept. 5, 2005, New 
Yorker reported that “In one study a waitress received 
50% more in tips when she introduced herself by name 
than when she didn’t.” Consider the following (fictitious) 
data on tip amount as a percentage of the bill: 

Introduction:      m = 50   $\bar{x} = 22.63$   $s_1 = 7.82$ 
No introduction:   n = 50   $\bar{y} = 14.15$   $s_2 = 6.10$ 

Does this data suggest that an introduction increases 
tips on average by more than 50%? State and test 
the relevant hypotheses. [Hint: Consider the parameter 
$\theta = \mu_1 - 1.5\mu_2$.] 


88. The paper “Quantitative Assessment of Glenohumeral 
Translation in Baseball Players” (The Amer. J. of Sports 
Med., 2004: 1711-1715) considered various aspects of 
shoulder motion for a sample of pitchers and another 
sample of position players [glenohumeral refers to the 
articulation between the humerus (ball) and the glenoid 
(socket)]. The authors kindly supplied the following data 
on anteroposterior translation (mm), a measure of the 
extent of anterior and posterior motion, both for the dominant 
arm and the nondominant arm. 


        Pos Dom Tr   Pos ND Tr   Pit Dom Tr   Pit ND Tr 
        30.31        32.54       27.63        24.33 
        44.86        40.95       30.57        26.36 
        22.09        23.48       32.62        30.62 
        31.26        31.11       39.79        33.74 
        28.07        28.75       28.50        29.84 
        31.93        29.32       26.70        26.71 
        34.68        34.79       30.34        26.45 
        29.10        28.87       28.69        21.49 
        25.51        27.59       31.19        20.82 
        22.49        21.01       36.00        21.75 
        28.74        30.31       31.58        28.32 
        27.89        27.92       32.55        27.22 
        28.48        27.85       29.56        28.86 
        25.60        24.95       28.64        28.58 
        20.21        21.59       28.58        27.15 
        33.77        32.48       31.99        29.46 
        32.59        32.48       27.16        21.26 
        32.60        31.61 
        29.30        27.46 
Mean    29.4463      29.2137     30.7112      26.6447 
SD       5.4655       4.7013      3.3310       3.6679 


a. Estimate the true average difference in translation 
between dominant and nondominant arms for pitchers 
in a way that conveys information about reliability 
and precision, and interpret the resulting estimate. 

b. Repeat (a) for position players. 

c. The authors asserted that “pitchers have greater difference 
in side-to-side anteroposterior translation of their 
shoulders compared with position players.” Do you 
agree? Explain. 


89. Suppose a level .05 test of $H_0: \mu_1 - \mu_2 = 0$ versus 
$H_a: \mu_1 - \mu_2 > 0$ is to be performed, assuming $\sigma_1 = \sigma_2 = 10$ 
and normality of both distributions, using equal sample 
sizes (m = n). Evaluate the probability of a type II error 
when $\mu_1 - \mu_2 = 1$ and n = 25, 100, 2500, and 10,000. 
Can you think of real problems in which the difference 
$\mu_1 - \mu_2 = 1$ has little practical significance? Would sample 
sizes of n = 10,000 be desirable in such problems? 


90. The invasive diatom species Didymosphenia geminata 
has the potential to inflict substantial ecological and 
economic damage in rivers. The article “Substrate 
Characteristics Affect Colonization by the Bloom-Forming 
Diatom Didymosphenia geminata” (Aquatic 
Ecology, 2010: 33-40) described an investigation of 
colonization behavior. One aspect of particular interest 
was whether the roughness of stones impacted the degree 
of colonization. The authors of the cited article kindly 
provided the accompanying data on roughness ratio 
(dimensionless) for specimens of sandstone and shale. 


Sandstone:  5.74  2.07  3.29  0.75  1.23 
            2.95  1.58  1.83  1.61  1.12 
            2.91  3.22  2.84  1.97  2.48 
            3.45  2.17  0.77  1.44  3.79 

Shale:      .56  .84  .40  .55  .36  .72 
            .29  .47  .66  .48  .28 
            .72  .31  .35  .32  .37  .43 
            .60  .54  .43  .51 


Normal probability plots of both samples show a reasonably 
linear pattern. Estimate the difference between 
true average roughness for sandstone and that for shale 
in a way that provides information about reliability 
and precision, and interpret your estimate. Does it 
appear that true average roughness differs for the two 
types of rocks (a formal test of this was reported in 
the article)? [Note: The investigators concluded that 
more diatoms colonized the rougher surface than the 
smoother surface.] 


91. Researchers sent 5000 resumes in response to job ads 
that appeared in the Boston Globe and Chicago Tribune. 
The resumes were identical except that 2500 of them had 
“white sounding” first names, such as Brett and Emily, 
whereas the other 2500 had “black sounding” names 
such as Tamika and Rasheed. The resumes of the first 
type elicited 250 responses and the resumes of the second 
type only 167 responses (these numbers are very 
consistent with information that appeared in a Jan. 15, 
2003, report by the Associated Press). Does this data 
strongly suggest that a resume with a “black” name is 
less likely to result in a response than is a resume with a 
“white” name? 


92. McNemar’s test, developed in Exercise 56, can also be 
used when individuals are paired (matched) to yield n 
pairs and then one member of each pair is given treatment 1 
and the other is given treatment 2. Then $X_1$ is 
the number of pairs in which both treatments were 
successful, and similarly for $X_2$, $X_3$, and $X_4$. The test 
statistic for testing equal efficacy of the two treatments 
is given by $(X_2 - X_3)/\sqrt{X_2 + X_3}$, which has approximately 
a standard normal distribution when $H_0$ is true. 
Use this to test whether the drug ergotamine is effective 
in the treatment of migraine headaches. 


                Ergotamine 
                S      F 
Placebo   S     44     34 
          F     46     30 


The data is fictitious, but the conclusion agrees with that 
in the article “Controlled Clinical Trial of Ergotamine 
Tartrate” (British Med. J., 1970: 325-327). 


BIBLIOGRAPHY 


See the bibliography at the end of Chapter 7. 
93. The article “Evaluating Variability in Filling 
Operations” (Food Tech., 1984: 51-55) describes two 
different filling operations used in a ground-beef packing 
plant. Both filling operations were set to fill packages 
with 1400 g of ground beef. In a random sample of size 
30 taken from each filling operation, the resulting means 
and standard deviations were 1402.24 g and 10.97 g for 
operation 1 and 1419.63 g and 9.96 g for operation 2. 

a. Using a .05 significance level, is there sufficient 
evidence to indicate that the true mean weight of the 
packages differs for the two operations? 

b. Does the data from operation 1 suggest that the true 
mean weight of packages produced by operation 1 is 
higher than 1400 g? Use a .05 significance level. 


94. Let $X_1, \ldots, X_m$ be a random sample from a Poisson distribution 
with parameter $\mu_1$, and let $Y_1, \ldots, Y_n$ be a random 
sample from another Poisson distribution with parameter 
$\mu_2$. We wish to test $H_0: \mu_1 - \mu_2 = 0$ against one of the 
three standard alternatives. When m and n are large, the 
large-sample z test of Section 9.1 can be used. However, the 
fact that $V(\bar{X}) = \mu/n$ suggests that a different denominator 
should be used in standardizing $\bar{X} - \bar{Y}$. Develop a large-sample 
test procedure appropriate to this problem, and then 
apply it to the following data to test whether the plant densities 
for a particular species are equal in two different 
regions (where each observation is the number of plants 
found in a randomly located square sampling quadrate having 
area 1 m², so for region 1 there were 40 quadrates in 
which one plant was observed, etc.): 


                       Frequency 
            0    1    2    3    4    5    6    7 
Region 1   28   40   28   17    8    2    1    1    m = 125 
Region 2   14   25   30   18   49    2    1    1    n = 140 


95. Referring to Exercise 94, develop a large-sample confidence 
interval formula for $\mu_1 - \mu_2$. Calculate the interval 
for the data given there using a confidence level of 95%. 


The Analysis of Variance 


INTRODUCTION 


In studying methods for the analysis of quantitative data, we first focused on 
problems involving a single sample of numbers and then turned to a comparative 
analysis of two such different samples. In one-sample problems, the data con- 
sisted of observations on or responses from individuals or experimental objects 
randomly selected from a single population. In two-sample problems, either the 
two samples were drawn from two different populations and the parameters of 
interest were the population means, or else two different treatments were applied 
to experimental units (individuals or objects) selected from a single population; in 
this latter case, the parameters of interest are referred to as true treatment means. 

The analysis of variance, or more briefly ANOVA, refers broadly to a 
collection of experimental situations and statistical procedures for the analysis 
of quantitative responses from experimental units. The simplest ANOVA prob- 
lem is referred to variously as a single-factor, single-classification, or one- 
way ANOVA. It involves the analysis either of data sampled from more than 
two numerical populations (distributions) or of data from experiments in which 
more than two treatments have been used. The characteristic that differenti- 
ates the treatments or populations from one another is called the factor under 
study, and the different treatments or populations are referred to as the levels 
of the factor. Examples of such situations include the following: 


1. An experiment to study the effects of five different brands of gasoline on 
automobile engine operating efficiency (mpg) 


2. An experiment to study the effects of the presence of four different sugar 
solutions (glucose, sucrose, fructose, and a mixture of the three) on bacterial 
growth 

3. An experiment to investigate whether hardwood concentration in pulp (%) 
at three different levels impacts tensile strength of bags made from the pulp 


4. An experiment to decide whether the color density of fabric specimens 
depends on which of four different dye amounts is used 


In (1) the factor of interest is gasoline brand, and there are five different 
levels of the factor. In (2) the factor is sugar, with four levels (or five, if a control 
solution containing no sugar is used). In both (1) and (2), the factor is qualita- 
tive in nature, and the levels correspond to possible categories of the factor. 
In (3) and (4), the factors are concentration of hardwood and amount of dye, 
respectively; both these factors are quantitative in nature, so the levels identify 
different settings of the factor. When the factor of interest is quantitative, sta- 
tistical techniques from regression analysis (discussed in Chapters 12 and 13) 
can also be used to analyze the data. 

This chapter focuses on single-factor ANOVA. Section 10.1 presents the 
F test for testing the null hypothesis that the population or treatment means 
are identical. Section 10.2 considers further analysis of the data when $H_0$ has 
been rejected. Section 10.3 covers some other aspects of single-factor ANOVA. 
Chapter 11 introduces ANOVA experiments involving more than a single factor. 


10.1 Single-Factor ANOVA 


Single-factor ANOVA focuses on a comparison of more than two population or 
treatment means. Let 

$I$ = the number of populations or treatments being compared 

$\mu_1$ = the mean of population 1 or the true average response when treatment 1 is 
applied 
  ⋮ 
$\mu_I$ = the mean of population $I$ or the true average response when treatment $I$ is 
applied 

The relevant hypotheses are 

$H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ 

versus 

$H_a$: at least two of the $\mu_i$’s are different 

If $I = 4$, $H_0$ is true only if all four $\mu_i$’s are identical. $H_a$ would be true, for example, if 
$\mu_1 = \mu_2 \ne \mu_3 = \mu_4$, if $\mu_1 = \mu_3 = \mu_4 \ne \mu_2$, or if all four $\mu_i$’s differ from one another. 

A test of these hypotheses requires that we have available a random sample 
from each population or treatment. 


EXAMPLE 10.1 The article “Compression of Single-Wall Corrugated Shipping Containers Using 
Fixed and Floating Test Platens” (J. Testing and Evaluation, 1992: 318-320) 
describes an experiment in which several different types of boxes were compared with 
respect to compression strength (lb). Table 10.1 presents the results of a single-factor 
ANOVA experiment involving $I = 4$ types of boxes (the sample means and standard 
deviations are in good agreement with values given in the article). 


Table 10.1 The Data and Summary Quantities for Example 10.1 


Type of Box   Compression Strength (lb)                    Sample Mean   Sample SD 
1             655.5  788.3  734.3  721.4  679.1  699.4     713.00        46.55 
2             789.2  772.5  786.9  686.1  732.1  774.8     756.93        40.34 
3             737.1  639.0  696.3  671.7  717.2  727.1     698.07        37.20 
4             535.1  628.7  542.4  559.0  586.9  520.0     562.02        39.87 
                                                           Grand mean = 682.50 


With $\mu_i$ denoting the true average compression strength for boxes of type i (i = 1, 2, 
3, 4), the null hypothesis is $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Figure 10.1(a) shows a comparative 
boxplot for the four samples. There is a substantial amount of overlap among 
observations on the first three types of boxes, but compression strengths for the fourth 
type appear considerably smaller than for the other types. This suggests that $H_0$ is 
not true. The comparative boxplot in Figure 10.1(b) is based on adding 120 to each 
observation in the fourth sample (giving mean 682.02 and the same standard deviation) 
and leaving the other observations unaltered. It is no longer obvious whether $H_0$ 
is true or false. In situations such as this, we need a formal test procedure. 


Figure 10.1 Boxplots for Example 10.1: (a) original data; (b) altered data 
Notation and Assumptions 


The letters X and Y were used in two-sample problems to differentiate the observa- 
tions in one sample from those in the other. Because this is cumbersome for three 
or more samples, it is customary to use a single letter with two subscripts. The first 
subscript identifies the sample number, corresponding to the population or treatment 
being sampled, and the second subscript denotes the position of the observation 
within that sample. Let 


$X_{i,j}$ = the random variable (rv) that denotes the jth measurement taken from 
the ith population, or the measurement taken on the jth experimental 
unit that receives the ith treatment 


$x_{i,j}$ = the observed value of $X_{i,j}$ when the experiment is performed 


The observed data is usually displayed in a rectangular table, such as Table 10.1. 
There, samples from the different populations appear in different rows of the table, 
and $x_{i,j}$ is the jth number in the ith row. For example, $x_{2,3} = 786.9$ (the third observation 
from the second population), and $x_{4,1} = 535.1$. When there is no ambiguity, we 
will write $x_{ij}$ rather than $x_{i,j}$ (e.g., if there were 15 observations on each of 12 treatments, 
$x_{112}$ could mean $x_{1,12}$ or $x_{11,2}$). It is assumed that the $X_{ij}$’s within any particular 
sample are independent (a random sample from the ith population or treatment 
distribution) and that different samples are independent of one another. 

In some experiments, different samples contain different numbers of observations. 
Here we’ll focus on the case of equal sample sizes; the generalization to 
unequal sample sizes appears in Section 10.3. Let $J$ denote the number of observations 
in each sample ($J = 6$ in Example 10.1). The data set consists of $IJ$ observations. 
The individual sample means will be denoted by $\bar{X}_{1\cdot}, \bar{X}_{2\cdot}, \ldots, \bar{X}_{I\cdot}$. That is, 

$$\bar{X}_{i\cdot} = \frac{\sum_{j=1}^{J} X_{ij}}{J} \qquad i = 1, 2, \ldots, I$$

The dot in place of the second subscript signifies that we have added over all values 
of that subscript while holding the other subscript value fixed, and the horizontal bar 
indicates division by $J$ to obtain an average. Similarly, the average of all $IJ$ observations, 
called the grand mean, is 

$$\bar{X}_{\cdot\cdot} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij}}{IJ}$$

For the data in Table 10.1, $\bar{x}_{1\cdot} = 713.00$, $\bar{x}_{2\cdot} = 756.93$, $\bar{x}_{3\cdot} = 698.07$, $\bar{x}_{4\cdot} = 562.02$, 
and $\bar{x}_{\cdot\cdot} = 682.50$. Additionally, let $S_1^2, S_2^2, \ldots, S_I^2$ denote the sample variances: 

$$S_i^2 = \frac{\sum_{j=1}^{J}(X_{ij} - \bar{X}_{i\cdot})^2}{J - 1} \qquad i = 1, 2, \ldots, I$$

From Example 10.1, $s_1 = 46.55$, $s_1^2 = 2166.90$, and so on. 


ASSUMPTIONS The $I$ population or treatment distributions are all normal with the same variance 
$\sigma^2$. That is, each $X_{ij}$ is normally distributed with 

$$E(X_{ij}) = \mu_i \qquad V(X_{ij}) = \sigma^2$$
The $I$ sample standard deviations will generally differ somewhat even when 
the corresponding $\sigma$’s are identical. In Example 10.1, the largest among $s_1$, $s_2$, $s_3$, and 
$s_4$ is about 1.25 times the smallest. A rough rule of thumb is that if the largest $s$ is 
not much more than two times the smallest, it is reasonable to assume equal $\sigma^2$’s. 

In previous chapters, a normal probability plot was suggested for checking 
normality. The individual sample sizes in ANOVA are often too small for $I$ separate 
plots to be informative. A single plot can be constructed by subtracting $\bar{x}_{1\cdot}$ 
from each observation in the first sample, $\bar{x}_{2\cdot}$ from each observation in the second, 
and so on, and then plotting these $IJ$ deviations against the z percentiles. Figure 10.2 
gives such a plot for the data of Example 10.1. The straightness of the pattern gives 
strong support to the normality assumption. 


Figure 10.2 A normal probability plot based on the data of Example 10.1 (vertical axis: deviation; horizontal axis: z percentile) 
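The pooled-deviation construction described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to draw Figure 10.2. In particular, the plotting positions (i - .5)/n used for the z percentiles are an assumption, and the variable names are hypothetical.

```python
from statistics import NormalDist, mean

# Compression strengths (lb) from Table 10.1, one list per box type.
samples = [
    [655.5, 788.3, 734.3, 721.4, 679.1, 699.4],
    [789.2, 772.5, 786.9, 686.1, 732.1, 774.8],
    [737.1, 639.0, 696.3, 671.7, 717.2, 727.1],
    [535.1, 628.7, 542.4, 559.0, 586.9, 520.0],
]

# Subtract each sample's own mean from its observations, then pool and sort.
deviations = sorted(x - mean(s) for s in samples for x in s)

# Assumed plotting positions: pair the i-th smallest deviation with the
# standard normal quantile at (i - 0.5)/n.
n = len(deviations)
z_percentiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each (z percentile, deviation) pair is one point of the probability plot.
pairs = list(zip(z_percentiles, deviations))
for z, d in pairs:
    print(f"{z:6.2f}  {d:8.2f}")
```

A roughly linear pattern in the printed pairs corresponds to a straight normal probability plot.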


If either the normality assumption or the assumption of equal variances is judged 
implausible, a method of analysis other than the usual F test must be employed. Please 
seek expert advice in such situations (one possibility, a data transformation, is sug- 
gested in Section 10.3, and another alternative is developed in Section 15.4). 


The Test Statistic 


If $H_0$ is true, the $J$ observations in each sample come from a normal population 
distribution with the same mean value $\mu$, in which case the sample means $\bar{x}_{1\cdot}, \ldots, \bar{x}_{I\cdot}$ 
should be reasonably close to one another. The test procedure is based on comparing 
a measure of differences among the $\bar{x}_{i\cdot}$’s (“between-samples” variation) to a measure 
of variation calculated from within each of the samples. 


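DEFINITION Mean square for treatments is given by 

$$\text{MSTr} = \frac{J}{I - 1}\left[(\bar{X}_{1\cdot} - \bar{X}_{\cdot\cdot})^2 + (\bar{X}_{2\cdot} - \bar{X}_{\cdot\cdot})^2 + \cdots + (\bar{X}_{I\cdot} - \bar{X}_{\cdot\cdot})^2\right] = \frac{J}{I - 1}\sum_{i=1}^{I}(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2$$

and mean square for error is 

$$\text{MSE} = \frac{S_1^2 + S_2^2 + \cdots + S_I^2}{I}$$

The test statistic for single-factor ANOVA is $F = \text{MSTr}/\text{MSE}$. 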
The terminology “mean square” will be explained shortly. Notice that uppercase $\bar{X}$’s 
and $S^2$’s are used, so MSTr and MSE are defined as statistics. We will follow tradition 
and also use MSTr and MSE (rather than mstr and mse) to denote the calculated 
values of these statistics. Each $S_i^2$ assesses variation within a particular sample, so 
MSE is a measure of within-samples variation. 

What kind of value of F provides evidence for or against $H_0$? If $H_0$ is true (all 
$\mu_i$’s are equal), the values of the individual sample means should be close to one 
another and therefore close to the grand mean, resulting in a relatively small value of 
MSTr. However, if the $\mu_i$’s are quite different, some $\bar{x}_{i\cdot}$’s should differ quite a bit from 
$\bar{x}_{\cdot\cdot}$. So the value of MSTr is affected by the status of $H_0$ (true or false). This is not the 
case with MSE, because the $s_i^2$’s depend only on the underlying value of $\sigma^2$ and not 
on where the various distributions are centered. The following box presents an important 
property of E(MSTr) and E(MSE), the expected values of these two statistics. 


PROPOSITION When $H_0$ is true, 

$$E(\text{MSTr}) = E(\text{MSE}) = \sigma^2$$

whereas when $H_0$ is false, 

$$E(\text{MSTr}) > E(\text{MSE}) = \sigma^2$$

That is, both statistics are unbiased for estimating the common population variance 
$\sigma^2$ when $H_0$ is true, but MSTr tends to overestimate $\sigma^2$ when $H_0$ is false. 


The unbiasedness of MSE is a consequence of $E(S_i^2) = \sigma^2$ whether $H_0$ is true or 
false. When $H_0$ is true, each $\bar{X}_{i\cdot}$ has the same mean value $\mu$ and variance $\sigma^2/J$, so 
$\sum(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2/(I - 1)$, the “sample variance” of the $\bar{X}_{i\cdot}$’s, estimates $\sigma^2/J$ unbiasedly; 
multiplying this by $J$ gives MSTr as an unbiased estimator of $\sigma^2$ itself. The $\bar{X}_{i\cdot}$’s tend 
to spread out more when $H_0$ is false than when it is true, tending to inflate the value 
of MSTr in this case. Thus a value of F that greatly exceeds 1, corresponding to an 
MSTr much larger than MSE, casts considerable doubt on $H_0$. Determination of the 
P-value requires that the distribution of F when $H_0$ is true be known. 
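The behavior of the two mean squares can be checked informally by simulation. The sketch below is an illustration only. The population means, the value $\sigma = 2$, the sample sizes, the number of replications, and the function names are all assumptions chosen for the demonstration. It averages MSTr and MSE over many simulated data sets, once with equal means and once with one mean shifted.

```python
import random
from statistics import mean, variance

def mean_squares(samples):
    """Return (MSTr, MSE) for I samples of equal size J."""
    I, J = len(samples), len(samples[0])
    sample_means = [mean(s) for s in samples]
    grand_mean = mean(sample_means)  # equals the grand mean when sizes are equal
    mstr = J / (I - 1) * sum((m - grand_mean) ** 2 for m in sample_means)
    mse = sum(variance(s) for s in samples) / I
    return mstr, mse

def average_mean_squares(mus, sigma, J, reps, seed):
    """Average MSTr and MSE over `reps` simulated normal data sets."""
    rng = random.Random(seed)
    total_mstr = total_mse = 0.0
    for _ in range(reps):
        samples = [[rng.gauss(mu, sigma) for _ in range(J)] for mu in mus]
        mstr, mse = mean_squares(samples)
        total_mstr += mstr
        total_mse += mse
    return total_mstr / reps, total_mse / reps

# Hypothetical settings: sigma = 2, so sigma^2 = 4.
null_mstr, null_mse = average_mean_squares([0, 0, 0, 0], 2.0, J=6, reps=2000, seed=1)
alt_mstr, alt_mse = average_mean_squares([0, 0, 0, 3], 2.0, J=6, reps=2000, seed=1)

print(f"H0 true:  average MSTr = {null_mstr:.2f}, average MSE = {null_mse:.2f}")
print(f"H0 false: average MSTr = {alt_mstr:.2f}, average MSE = {alt_mse:.2f}")
```

With equal means both averages should land near $\sigma^2 = 4$; with one mean shifted, only the MSTr average should rise well above it.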


F Distributions and the F Test 


In Section 9.5, we introduced a family of probability distributions called F distributions. 
An F distribution arises in connection with a ratio in which there is one number 
of degrees of freedom (df) associated with the numerator and another number 
of degrees of freedom associated with the denominator. Let $\nu_1$ and $\nu_2$ denote the 
number of numerator and denominator degrees of freedom, respectively, for a variable 
with an F distribution. Both $\nu_1$ and $\nu_2$ are positive integers. Figure 10.3 pictures 
an F density curve and the corresponding upper-tail critical value $F_{\alpha,\nu_1,\nu_2}$. Appendix 
Table A.9 gives these critical values for $\alpha$ = .10, .05, .01, and .001. Values of $\nu_1$ are 
identified with different columns of the table, and the rows are labeled with various 
values of $\nu_2$. For example, the F critical value that captures upper-tail area .05 under 
the F curve with $\nu_1 = 4$ and $\nu_2 = 6$ is $F_{.05,4,6} = 4.53$, whereas $F_{.05,6,4} = 6.16$. 


Figure 10.3 An F density curve for $\nu_1$ and $\nu_2$ df and critical value $F_{\alpha,\nu_1,\nu_2}$ (shaded upper-tail area = $\alpha$) 


The key theoretical result is that the test statistic F has an F distribution when $H_0$ is true. 


THEOREM Let $F = \text{MSTr}/\text{MSE}$ be the test statistic in a single-factor ANOVA problem 
involving $I$ populations or treatments with a random sample of $J$ observations 
from each one. When $H_0$ is true and the basic assumptions of this section are 
satisfied, F has an F distribution with $\nu_1 = I - 1$ and $\nu_2 = I(J - 1)$. Because 
a larger f is more contradictory to $H_0$ than a smaller f, the test is upper-tailed: 

P-value = $P(F \ge f$ when $H_0$ is true) 
        = area under the $F_{I-1,\,I(J-1)}$ curve to the right of f 

Statistical software will provide an exact P-value. Refer to Section 9.5 for a 
description of how our book’s table of F critical values, Table A.9, can be used 
to obtain an upper or lower bound (or both) on the P-value. 


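The rationale for $\nu_1 = I - 1$ is that although MSTr is based on the $I$ deviations 
$\bar{X}_{1\cdot} - \bar{X}_{\cdot\cdot}, \ldots, \bar{X}_{I\cdot} - \bar{X}_{\cdot\cdot}$, we have $\sum(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}) = 0$, so only $I - 1$ of these are freely determined. 
Because each sample contributes $J - 1$ df to MSE and these samples are independent, 
$\nu_2 = (J - 1) + \cdots + (J - 1) = I(J - 1)$. 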


EXAMPLE 10.2 (Example 10.1 continued) The values of $I$ and $J$ for the strength data are 4 and 6, respectively, so numerator 
df = $I - 1 = 3$ and denominator df = $I(J - 1) = 20$. The grand mean is 
$\bar{x}_{\cdot\cdot} = \sum\sum x_{ij}/(IJ) = 682.50$, 

$$\text{MSTr} = \frac{6}{4 - 1}\left[(713.00 - 682.50)^2 + (756.93 - 682.50)^2 + (698.07 - 682.50)^2 + (562.02 - 682.50)^2\right] = 42{,}455.86$$

$$\text{MSE} = \frac{1}{4}\left[(46.55)^2 + (40.34)^2 + (37.20)^2 + (39.87)^2\right] = 1691.92$$

$$f = \text{MSTr}/\text{MSE} = 42{,}455.86/1691.92 = 25.09$$

The largest F critical value in Table A.9 for $\nu_1 = 3$, $\nu_2 = 20$ is $F_{.001,3,20} = 8.10$. Since 
f = 25.09 > 8.10, the area under the $F_{3,20}$ curve to the right of 25.09 is smaller than 
.001. Therefore P-value < .001, so the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ is 
resoundingly rejected at significance level .05. True average compression strength 
does appear to depend on box type. In fact, because the P-value is so small, $H_0$ would 
be rejected at any reasonable significance level. 
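The arithmetic of Example 10.2 can be reproduced from the raw observations in Table 10.1. The following is a minimal sketch with hypothetical variable names. Because it works from the unrounded sample means and variances, MSTr and MSE may differ slightly from the values above, which were computed from rounded summary quantities.

```python
from statistics import mean, variance

# Compression strengths (lb) from Table 10.1, keyed by box type.
strength = {
    1: [655.5, 788.3, 734.3, 721.4, 679.1, 699.4],
    2: [789.2, 772.5, 786.9, 686.1, 732.1, 774.8],
    3: [737.1, 639.0, 696.3, 671.7, 717.2, 727.1],
    4: [535.1, 628.7, 542.4, 559.0, 586.9, 520.0],
}

I = len(strength)      # number of treatments
J = len(strength[1])   # common sample size

sample_means = [mean(xs) for xs in strength.values()]
grand_mean = mean(x for xs in strength.values() for x in xs)

# Between-samples and within-samples mean squares, as defined in this section.
mstr = J / (I - 1) * sum((m - grand_mean) ** 2 for m in sample_means)
mse = sum(variance(xs) for xs in strength.values()) / I
f = mstr / mse

print(f"df = ({I - 1}, {I * (J - 1)})")
print(f"MSTr = {mstr:.2f}, MSE = {mse:.2f}, f = {f:.2f}")
```

Comparing the printed f with the tabled critical value 8.10 leads to the same conclusion as in the example.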


EXAMPLE 10.3 The article “Influence of Contamination and Cleaning on Bond Strength to 
Modified Zirconia” (Dental Materials, 2009: 1541-1550) reported on an experi- 
ment in which 50 zirconium-oxide disks were divided into five groups of 10 each. 
Then a different contamination/cleaning protocol was used for each group. The fol- 
lowing summary data on shear bond strength (MPa) appeared in the article: 


Treatment:     1      2      3      4      5 
Sample mean   10.5   14.8   15.7   16.0   21.6    Grand mean = 15.7 
Sample sd      4.5    6.8    6.5    6.7    6.0 
The authors of the cited article used the F test, so hopefully examined a normal probability 
plot of the deviations (or a separate plot for each sample, since each sample 
size is 10) to check the plausibility of assuming normal treatment-response distributions. 
The five sample standard deviations are certainly close enough to one another 
to support the assumption of equal $\sigma$’s. 


1. $\mu_i$ = true average bond strength for protocol i (i = 1, 2, 3, 4, 5) 

2. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ (true average bond strength does not depend on 
which protocol is used) 

3. $H_a$: at least two of the $\mu_i$’s are different 

4. The test statistic value is $f = \text{MSTr}/\text{MSE}$ 

5. Numerator and denominator dfs are $I - 1 = 4$ and $I(J - 1) = 5(9) = 45$. The 
mean squares are 

$$\text{MSTr} = \frac{10}{4}\left[(10.5 - 15.7)^2 + (14.8 - 15.7)^2 + (15.7 - 15.7)^2 + (16.0 - 15.7)^2 + (21.6 - 15.7)^2\right] = 156.875$$

$$\text{MSE} = \left[(4.5)^2 + (6.8)^2 + (6.5)^2 + (6.7)^2 + (6.0)^2\right]/5 = 37.926$$

Thus the test statistic value is f = 156.875/37.926 = 4.14. 

6. Table A.9 gives $F_{.01,4,40} = 3.83$, $F_{.01,4,50} = 3.72$, $F_{.001,4,40} = 5.70$, and $F_{.001,4,50} = 5.46$. 
Therefore $F_{.01,4,45} \approx 3.77$ and $F_{.001,4,45} \approx 5.56$. Because f = 4.14 falls 
between these latter two critical values, the area under the $F_{4,45}$ curve to the right 
of 4.14 (i.e., the P-value) is between .001 and .01 (software yields .0061). 

7. Since P-value < .01, the null hypothesis should be rejected at this significance 
level. True average bond strength does appear to depend on which protocol is 
used. 
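When only summary statistics are published, as in Example 10.3, the same F ratio can be computed from the sample means, the sample standard deviations, and the common sample size. The helper below is a hypothetical illustration (the function name `f_from_summary` is not from the text) and assumes equal sample sizes. It uses the unrounded grand mean 15.72 rather than 15.7, so MSTr differs very slightly from the value in step 5.

```python
def f_from_summary(means, sds, J):
    """Return (MSTr, MSE, f) from I sample means and SDs with common size J."""
    I = len(means)
    grand_mean = sum(means) / I  # valid as the grand mean because sizes are equal
    mstr = J / (I - 1) * sum((m - grand_mean) ** 2 for m in means)
    mse = sum(s ** 2 for s in sds) / I
    return mstr, mse, mstr / mse

# Summary data on shear bond strength (MPa) from Example 10.3.
means = [10.5, 14.8, 15.7, 16.0, 21.6]
sds = [4.5, 6.8, 6.5, 6.7, 6.0]

mstr, mse, f = f_from_summary(means, sds, J=10)
print(f"MSTr = {mstr:.3f}, MSE = {mse:.3f}, f = {f:.2f}")
```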


When the null hypothesis is rejected by the F test, as happened in both Examples 
10.2 and 10.3, the experimenter will often be interested in further analysis of 
the data to decide which $\mu_i$’s differ from which others. Methods for doing this are 
called multiple comparison procedures; that is the topic of Section 10.2. The article 
cited in Example 10.3 summarizes the results of such an analysis. 


Sums of Squares 


The introduction of sums of squares facilitates developing an intuitive appreciation for 
the rationale underlying single-factor and multifactor ANOVAs. Let $x_{i\cdot}$ represent the 
sum (not the average, since there is no bar) of the $x_{ij}$’s for i fixed (sum of the numbers 
in the ith row of the table) and $x_{\cdot\cdot}$ denote the sum of all the $x_{ij}$’s (the grand total). 


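DEFINITION The total sum of squares (SST), treatment sum of squares (SSTr), and 
error sum of squares (SSE) are given by 

$$\text{SST} = \sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij} - \bar{x}_{\cdot\cdot})^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} x_{ij}^2 - \frac{1}{IJ}x_{\cdot\cdot}^2$$

$$\text{SSTr} = \sum_{i=1}^{I}\sum_{j=1}^{J}(\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2 = \frac{1}{J}\sum_{i=1}^{I} x_{i\cdot}^2 - \frac{1}{IJ}x_{\cdot\cdot}^2$$

$$\text{SSE} = \sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij} - \bar{x}_{i\cdot})^2 \qquad \text{where } x_{i\cdot} = \sum_{j=1}^{J} x_{ij}, \quad x_{\cdot\cdot} = \sum_{i=1}^{I}\sum_{j=1}^{J} x_{ij}$$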
The sum of squares SSTr appears in the numerator of F, and SSE appears in the 
denominator of F; the reason for defining SST will be apparent shortly. 

The expressions on the far right-hand side of SST and SSTr are convenient if 
ANOVA calculations will be done by hand, although the wide availability of statistical 
software makes this unnecessary. Both SST and SSTr involve $x_{\cdot\cdot}^2/(IJ)$ (the square 
of the grand total divided by $IJ$), which is usually called the correction factor for 
the mean (CF). After the correction factor is computed, SST is obtained by squaring 
each number in the data table, adding these squares together, and subtracting the correction 
factor. SSTr results from squaring each row total, summing them, dividing 
by $J$, and subtracting the correction factor. SSE is then easily obtained by virtue of 
the following relationship. 


Fundamental Identity 


SST = SSTr + SSE (10.1) 


Thus if any two of the sums of squares are computed, the third can be obtained 
through (10.1); SST and SSTr are easiest to compute, and then SSE = SST — SSTr. 
The proof follows from squaring both sides of the relationship 


$$x_{ij} - \bar{x}_{\cdot\cdot} = (x_{ij} - \bar{x}_{i\cdot}) + (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})$$ (10.2) 


and summing over all i and j. This gives SST on the left and SSTr and SSE as the 
two extreme terms on the right. The cross-product term is easily seen to be zero. 
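For readers who want the omitted step, the squaring and summing can be written out as follows. This is a sketch of the standard argument in the notation of this section, not a quotation of a proof given elsewhere in the text.

```latex
\begin{align*}
\sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij}-\bar{x}_{\cdot\cdot})^2
  &= \sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij}-\bar{x}_{i\cdot})^2
   + 2\sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij}-\bar{x}_{i\cdot})(\bar{x}_{i\cdot}-\bar{x}_{\cdot\cdot})
   + \sum_{i=1}^{I}\sum_{j=1}^{J}(\bar{x}_{i\cdot}-\bar{x}_{\cdot\cdot})^2 \\
% The cross-product term vanishes because deviations from a sample mean sum to zero:
\sum_{i=1}^{I}\sum_{j=1}^{J}(x_{ij}-\bar{x}_{i\cdot})(\bar{x}_{i\cdot}-\bar{x}_{\cdot\cdot})
  &= \sum_{i=1}^{I}(\bar{x}_{i\cdot}-\bar{x}_{\cdot\cdot})
     \Bigl[\sum_{j=1}^{J}x_{ij} - J\bar{x}_{i\cdot}\Bigr] = 0 \\
\text{so}\quad \text{SST} &= \text{SSE} + \text{SSTr}
\end{align*}
```

The bracketed inner sum is zero for every i because $\sum_j x_{ij} = J\bar{x}_{i\cdot}$ by the definition of the sample mean.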

The interpretation of the fundamental identity is an important aid to an understanding 
of ANOVA. SST is a measure of the total variation in the data: the sum of 
all squared deviations about the grand mean. The identity says that this total variation 
can be partitioned into two pieces. SSE measures variation that would be present 
(within rows) whether $H_0$ is true or false, and is thus the part of total variation that is 
unexplained by the status of $H_0$. SSTr is the amount of variation (between rows) that 
can be explained by possible differences in the $\mu_i$’s. $H_0$ is rejected if the explained 
variation is large relative to unexplained variation. 

Once SSTr and SSE are computed, each is divided by its associated df to obtain a 
mean square (mean in the sense of average). Then F is the ratio of the two mean squares. 


$$\text{MSTr} = \frac{\text{SSTr}}{I - 1} \qquad \text{MSE} = \frac{\text{SSE}}{I(J - 1)} \qquad F = \frac{\text{MSTr}}{\text{MSE}}$$ (10.3) 


The computations are often summarized in a tabular format, called an ANOVA 
table, as displayed in Table 10.2. Tables produced by statistical software customarily 
include a P-value column to the right of f. 


Table 10.2 An ANOVA Table 


Source of Variation   df         Sum of Squares   Mean Square               f 
Treatments            I - 1      SSTr             MSTr = SSTr/(I - 1)       MSTr/MSE 
Error                 I(J - 1)   SSE              MSE = SSE/[I(J - 1)] 
Total                 IJ - 1     SST 
EXAMPLE 10.4 According to the article “Evaluating Fracture Behavior of Brittle Polymeric 
Materials Using an IASCB Specimen” (J. of Engr. Manuf., 2013: 133-140), 
researchers have recently proposed an improved test for the investigation of frac- 
ture toughness of brittle polymeric materials. This new fracture test was applied to 
the brittle polymer polymethylmethacrylate (PMMA), more popularly known as 
Plexiglas, which is widely used in commercial products. The test was performed by 
applying asymmetric three-point bending loads on PMMA specimens. The location 
of one of the three loading points was then varied to determine its effect on fracture 
load. In one experiment, three loading point locations based on different distances 
from the center of the specimen’s base were selected, resulting in the following 
fracture load data (kN): 


                                                   $x_{i.}$
            42 mm:     2.62   2.99   3.39   2.86   11.86
Distance    36 mm:     3.47   3.85   3.77   3.63   14.72
            31.2 mm:   4.78   4.41   4.91   5.06   19.16

                                          $x_{..} = 45.74$


Let $\mu_i$ denote true average fracture load when distance i is used (i = 1, 2, 3). The
null hypothesis asserts that these three $\mu_i$'s are identical, whereas the alternative
hypothesis says that not all the $\mu_i$'s are the same. Before using the F test at
significance level .01, we should check the plausibility of underlying assumptions. The
three sample standard deviations are .322, .167, and .278, respectively. Sure enough,
the largest of these three is no more than twice the smallest. So the assumption of
equal variances is plausible. Figure 10.4 shows a normal probability plot of the 12
residuals obtained by subtracting the mean of each sample from the four sample
observations. They don’t come much straighter than this! It is reasonable to assume
that the three fracture load distributions are normal.


Figure 10.4 Normal probability plot of the residuals from Example 10.4
(vertical axis: Percent; horizontal axis: Residual)


Squaring each of the 12 observations and adding gives
$\sum\sum x_{ij}^2 = (2.62)^2 + \cdots + (5.06)^2 = 181.7376$. The values of the three sums of squares are

$\text{SST} = 181.7376 - (45.74)^2/12 = 181.7376 - 174.3456 = 7.3920$
$\text{SSTr} = \frac{1}{4}[(11.86)^2 + (14.72)^2 + (19.16)^2] - 174.3456 = 6.7653$
$\text{SSE} = 7.3920 - 6.7653 = .6267$
The accompanying ANOVA table from Minitab summarizes the computations.
With a P-value of .000, the null hypothesis can be rejected at any sensible
significance level, and in particular at the chosen level .01. There is compelling evidence
for concluding that true average fracture load is not the same for all three distances.


Source   DF       SS       MS       F       P
Dist      2   6.7653   3.3826   48.58   0.000
Error     9   0.6267   0.0696
Total    11   7.3920
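
The hand computations of Example 10.4 can be checked with a few lines of code. The following is a minimal sketch, assuming equal sample sizes and using only the computing formulas quoted above; the function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the text or of any statistical package.

```python
# One-way ANOVA sums of squares for equal sample sizes (illustrative sketch).

def one_way_anova(samples):
    """Return sums of squares, mean squares, and f for I samples of common size J."""
    I = len(samples)                       # number of treatments
    J = len(samples[0])                    # observations per treatment
    if any(len(s) != J for s in samples):
        raise ValueError("this sketch assumes equal sample sizes")
    grand_total = sum(sum(s) for s in samples)
    correction = grand_total ** 2 / (I * J)            # x..^2 / (IJ)
    sst = sum(x ** 2 for s in samples for x in s) - correction
    sstr = sum(sum(s) ** 2 for s in samples) / J - correction
    sse = sst - sstr                                   # fundamental identity
    mstr = sstr / (I - 1)
    mse = sse / (I * (J - 1))
    return {"SSTr": sstr, "SSE": sse, "SST": sst,
            "MSTr": mstr, "MSE": mse, "f": mstr / mse}

# Fracture load data (kN) from Example 10.4, one row per loading distance.
fracture_load = [
    [2.62, 2.99, 3.39, 2.86],   # 42 mm
    [3.47, 3.85, 3.77, 3.63],   # 36 mm
    [4.78, 4.41, 4.91, 5.06],   # 31.2 mm
]

anova = one_way_anova(fracture_load)
for name, value in anova.items():
    print(f"{name:5s} {value:.4f}")
```

The printed values should agree with the Minitab table to the displayed precision. The P-value is not computed here because it requires the F distribution, which the Python standard library does not provide.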


EXERCISES Section 10.1 (1-10) 


1. In an experiment to compare the tensile strengths of
I = 5 different types of copper wire, J = 4 samples of
each type were used. The between-samples and
within-samples estimates of $\sigma^2$ were computed as MSTr =
2673.3 and MSE = 1094.2, respectively. Use the F test
at level .05 to test $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ versus
$H_a$: at least two $\mu_i$'s are unequal.


2. Suppose that the compression-strength observations on
the fourth type of box in Example 10.1 had been 655.1,
748.7, 662.4, 679.0, 706.9, and 640.0 (obtained by
adding 120 to each previous $x_{4j}$). Assuming no change in the
remaining observations, carry out an F test with $\alpha = .05$.


3. The lumen output was determined for each of I = 3
different brands of lightbulbs having the same wattage, with
J = 8 bulbs of each brand tested. The sums of squares
were computed as SSE = 4773.3 and SSTr = 591.2.
State the hypotheses of interest (including word
definitions of parameters), and use the F test of ANOVA
($\alpha = .05$) to decide whether there are any differences in
true average lumen outputs among the three brands for
this type of bulb by obtaining as much information as
possible about the P-value.


4. It is common practice in many countries to destroy
(shred) refrigerators at the end of their useful lives. In
this process material from insulating foam may be
released into the atmosphere. The article “Release of
Fluorocarbons from Insulation Foam in Home
Appliances During Shredding” (J. of the Air and
Waste Mgmt. Assoc., 2007: 1452-1460) gave the
following data on foam density (g/L) for each of two
refrigerators produced by four different manufacturers:


1. 30.4, 29.2 221 240 
3. 27.1, 24.8 4. 25.5, 28.8 


Does it appear that true average foam density is not the 
same for all these manufacturers? Carry out an appropriate 
test of hypotheses by obtaining as much P-value infor- 
mation as possible, and summarize your analysis in an 
ANOVA table. 


5. Consider the following summary data on the modulus of
elasticity ($\times 10^6$ psi) for lumber of three different grades
[in close agreement with values in the article “Bending
Strength and Stiffness of Second-Growth Douglas-Fir
Dimension Lumber” (Forest Products J., 1991: 35-43),
except that the sample sizes there were larger]:


Grade    J     $\bar{x}_{i.}$    $s_i$
1        10    1.63    .27
2        10    1.56    .24
3        10    1.42    .26


Use this data and a significance level of .01 to test the
null hypothesis of no difference in mean modulus of
elasticity for the three grades.


6. The article “Origin of Precambrian Iron Formations”
(Econ. Geology, 1964: 1025-1057) reports the
following data on total Fe for four types of iron formation
(1 = carbonate, 2 = silicate, 3 = magnetite, 4 = hematite).


if 20.5 28.1 27.8 27.0 28.0 
2552 25:3 21.1 20.5 31.3 
2: 26.3 24.0 26.2 20.2 23.7 
34.0 17.1 26.8 23.7 24.9 
3: 29.5 34.0 2135 29.4 27.9 
26.2 29.9 29.5 30.0 35.6 
4: 36.5 44.2 34.1 30.3 31.4 


33.1 34.1 32.9 36.3 23:3 


Carry out an analysis of variance F test at significance 
level .01, and summarize the results in an ANOVA table. 


7. An experiment was carried out to compare electrical
resistivity for six different low-permeability concrete
bridge deck mixtures. There were 26 measurements on
concrete cylinders for each mixture; these were obtained
28 days after casting. The entries in the accompanying
ANOVA table are based on information in the article
“In-Place Resistivity of Bridge Deck Concrete
Mixtures” (ACI Materials J., 2009: 114-122). Fill in
the remaining entries and test appropriate hypotheses.


Source     df    Sum of Squares    Mean Square    f
Mixture
Error                              13.929
Total            5664.415
8. A study of the properties of metal plate-connected
trusses used for roof support (“Modeling Joints Made
with Light-Gauge Metal Connector Plates,” Forest
Products J., 1979: 39-44) yielded the following
observations on axial-stiffness index (kips/in.) for plate lengths
4, 6, 8, 10, and 12 in:


 4:   309.2   409.5   311.0   326.5   316.8   349.8   309.7
 6:   402.1   347.2   361.0   404.5   331.0   348.9   381.7
 8:   392.4   366.2   351.0   357.1   409.9   367.3   382.0
10:   346.7   452.9   461.4   433.1   410.6   384.2   362.6
12:   407.4   441.8   419.9   410.7   473.4   441.2   465.8


Does variation in plate length have any effect on
true average axial stiffness? State and test the
relevant hypotheses using analysis of variance with
$\alpha = .01$. Display your results in an ANOVA table.
[Hint: $\sum\sum x_{ij}^2 = 5{,}241{,}420.79$.]


9. Six samples of each of four types of cereal grain grown
in a certain region were analyzed to determine thiamin
content, resulting in the following data (μg/g):


Wheat    5.2   4.5   6.0   6.1   6.7   5.8
Barley   6.5   8.0   6.1   7.5   5.9   5.6
Maize    5.8   4.7   6.4   4.9   6.0   5.2
Oats     8.3   6.1   7.8   7.0   5.5   7.2


Does this data suggest that at least two of the grains
differ with respect to true average thiamin content? Use a
level $\alpha = .05$ test.


10. In single-factor ANOVA with I treatments and J
observations per treatment, let $\mu = (1/I)\sum \mu_i$.

a. Express $E(\bar{X}_{..})$ in terms of $\mu$. [Hint: $\bar{X}_{..} = (1/I)\sum \bar{X}_{i.}$]

b. Determine $E(\bar{X}_{i.}^2)$. [Hint: For any rv Y, $E(Y^2) = V(Y) + [E(Y)]^2$.]

c. Determine $E(\bar{X}_{..}^2)$.

d. Determine E(SSTr) and then show that

$E(\text{MSTr}) = \sigma^2 + \dfrac{J}{I - 1}\sum (\mu_i - \mu)^2$

e. Using the result of part (d), what is E(MSTr) when
$H_0$ is true? When $H_0$ is false, how does E(MSTr)
compare to $\sigma^2$?


10.2 Multiple Comparisons in ANOVA 


When the computed value of the F statistic in single-factor ANOVA is not
significant, the analysis is terminated because no differences among the $\mu_i$'s have been
identified. But when $H_0$ is rejected, the investigator will usually want to know which
of the $\mu_i$'s are different from one another. A method for carrying out this further
analysis is called a multiple comparisons procedure.

Several of the most frequently used procedures are based on the following
central idea. First calculate a confidence interval for each pairwise difference $\mu_i - \mu_j$
with i < j. Thus if I = 4, the six required CIs would be for $\mu_1 - \mu_2$ (but not also for
$\mu_2 - \mu_1$), $\mu_1 - \mu_3$, $\mu_1 - \mu_4$, $\mu_2 - \mu_3$, $\mu_2 - \mu_4$, and $\mu_3 - \mu_4$. Then if the interval
for $\mu_1 - \mu_2$ does not include 0, conclude that $\mu_1$ and $\mu_2$ differ significantly from
one another; if the interval does include 0, the two $\mu$'s are judged not significantly
different. Following the same line of reasoning for each of the other intervals, we end
up being able to judge for each pair of $\mu$'s whether or not they differ significantly
from one another.

The procedures based on this idea differ in how the various CIs are calculated.
Here we present a popular method that controls the simultaneous confidence level
for all I(I − 1)/2 intervals.


Tukey's Procedure (the T Method) 


Tukey’s procedure involves the use of another probability distribution called the
Studentized range distribution. The distribution depends on two parameters: a
numerator df m and a denominator df $\nu$. Let $Q_{\alpha,m,\nu}$ denote the upper-tail $\alpha$ critical
value of the Studentized range distribution with m numerator df and $\nu$ denominator
df (analogous to $F_{\alpha,\nu_1,\nu_2}$). Values of $Q_{\alpha,m,\nu}$ are given in Appendix Table A.10.
PROPOSITION With probability $1 - \alpha$,

$\bar{X}_{i.} - \bar{X}_{j.} - Q_{\alpha,I,I(J-1)}\sqrt{\text{MSE}/J} \le \mu_i - \mu_j \le \bar{X}_{i.} - \bar{X}_{j.} + Q_{\alpha,I,I(J-1)}\sqrt{\text{MSE}/J}$    (10.4)

for every i and j (i = 1,..., I and j = 1,..., I) with i < j.


Notice that numerator df for the appropriate $Q_\alpha$ critical value is I, the number of
population or treatment means being compared, and not I − 1 as in the F test. When the
computed $\bar{x}_{i.}$, $\bar{x}_{j.}$, and MSE are substituted into (10.4), the result is a collection of
confidence intervals with simultaneous confidence level $100(1 - \alpha)\%$ for all pairwise
differences of the form $\mu_i - \mu_j$ with i < j. Each interval that does not include 0 yields
the conclusion that the corresponding values of $\mu_i$ and $\mu_j$ differ significantly from one
another.

Since we are not really interested in the lower and upper limits of the various 
intervals but only in which include 0 and which do not, much of the arithmetic asso- 
ciated with (10.4) can be avoided. The following box gives details and describes how 
differences can be identified visually using an “underscoring pattern.” 


The T Method for Identifying Significantly Different $\mu_i$'s


Select $\alpha$, extract $Q_{\alpha,I,I(J-1)}$ from Appendix Table A.10, and calculate
$w = Q_{\alpha,I,I(J-1)} \cdot \sqrt{\text{MSE}/J}$. Then list the sample means in increasing order and
underline those pairs that differ by less than w. Any pair of sample means not
underscored by the same line corresponds to a pair of population or treatment
means that are judged significantly different.


Suppose, for example, that I = 5 and that

$\bar{x}_{2.} < \bar{x}_{5.} < \bar{x}_{4.} < \bar{x}_{1.} < \bar{x}_{3.}$

Then

1. Consider first the smallest mean $\bar{x}_{2.}$. If $\bar{x}_{5.} - \bar{x}_{2.} \ge w$, proceed to Step 2. However,
if $\bar{x}_{5.} - \bar{x}_{2.} < w$, connect these first two means with a line segment. Then if
possible extend this line segment even further to the right to the largest $\bar{x}_{i.}$ that
differs from $\bar{x}_{2.}$ by less than w (so the line may connect two, three, or even more
means).


2. Now move to $\bar{x}_{5.}$ and again extend a line segment to the largest $\bar{x}_{i.}$ to its right that
differs from $\bar{x}_{5.}$ by less than w (it may not be possible to draw this line, or
alternatively it may underscore just two means, or three, or even all four remaining means).


3. Continue by moving to $\bar{x}_{4.}$ and repeating, and then finally move to $\bar{x}_{1.}$.


To summarize, starting from each mean in the ordered list, a line segment is extended as
far to the right as possible as long as the difference between the means is smaller than w.
It is easily verified that a particular interval of the form (10.4) will contain 0 if and only
if the corresponding pair of sample means is underscored by the same line segment.
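
The underscoring steps above can be sketched in code. This is an illustrative implementation under stated assumptions: the function name `underscore_segments` is hypothetical, the yardstick w is supplied by the caller (the critical value Q must still be read from Appendix Table A.10), and segments wholly contained in an earlier segment are omitted because they carry no additional information. The sample means and the values Q = 4.04, MSE = .088, J = 9 are those reported in Example 10.5.

```python
from math import sqrt

def underscore_segments(sample_means, w):
    """Return the underscoring pattern as tuples of treatment labels.

    sample_means maps a treatment label to its sample mean; w is the
    yardstick Q * sqrt(MSE / J) computed beforehand.
    """
    ordered = sorted(sample_means.items(), key=lambda item: item[1])
    segments = []
    furthest = 0                       # right end of the last segment drawn
    for start, (_, mean) in enumerate(ordered):
        end = start
        # Extend to the largest mean that differs from this one by less than w.
        while end + 1 < len(ordered) and ordered[end + 1][1] - mean < w:
            end += 1
        # Draw the line only if it joins at least two means and is not
        # already covered by a previously drawn segment.
        if end > start and end > furthest:
            segments.append(tuple(label for label, _ in ordered[start:end + 1]))
            furthest = end
    return segments

# Brand sample means from Example 10.5 (labels are brand numbers).
means = {1: 14.5, 2: 13.8, 3: 13.3, 4: 14.3, 5: 13.1}
w = 4.04 * sqrt(0.088 / 9)             # Q from Table A.10, MSE, and J from the example

print(underscore_segments(means, w))   # segments for the original data

means_alt = dict(means)
means_alt[2] = 14.15                   # the modified value considered in the example
print(underscore_segments(means_alt, w))
```

Two treatments are judged significantly different exactly when no returned tuple contains both labels.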


EXAMPLE 10.5 An experiment was carried out to compare five different brands of automobile oil
filters with respect to their ability to capture foreign material. Let $\mu_i$ denote the true
average amount of material captured by brand i filters (i = 1,..., 5) under controlled
conditions. A sample of nine filters of each brand was used, resulting in the
following sample mean amounts: $\bar{x}_{1.} = 14.5$, $\bar{x}_{2.} = 13.8$, $\bar{x}_{3.} = 13.3$, $\bar{x}_{4.} = 14.3$, and
$\bar{x}_{5.} = 13.1$. Table 10.3 is the ANOVA table summarizing the first part of the analysis.


Table 10.3 ANOVA Table for Example 10.5


Source of Variation    df    Sum of Squares    Mean Square    f
Treatments (brands)     4        13.32            3.33        37.84
Error                  40         3.53            .088
Total                  44        16.85


Since $F_{.001,4,40} = 5.70$, the P-value is smaller than .001. Therefore, $H_0$ is rejected
(decisively) at level .05. We now use Tukey’s procedure to look for significant differences
among the $\mu_i$'s. From Appendix Table A.10, $Q_{.05,5,40} = 4.04$ (the second subscript on
Q is I and not I − 1 as in F), so $w = 4.04\sqrt{.088/9} = .4$. After arranging the five
sample means in increasing order, the two smallest can be connected by a line
segment because they differ by less than .4. However, this segment cannot be extended
further to the right since $13.8 - 13.1 = .7 \ge .4$. Moving one mean to the right, the
pair $\bar{x}_{3.}$ and $\bar{x}_{2.}$ cannot be underscored because these means differ by more than .4.
Again moving to the right, the next mean, 13.8, cannot be connected to any further
to the right. The last two means can be underscored with the same line segment.


$\bar{x}_{5.}$    $\bar{x}_{3.}$    $\bar{x}_{2.}$    $\bar{x}_{4.}$    $\bar{x}_{1.}$
13.1    13.3    13.8    14.3    14.5
------------            ------------

Thus brands 1 and 4 are not significantly different from one another, but are
significantly higher than the other three brands in their true average contents. Brand 2 is
significantly better than 3 and 5 but worse than 1 and 4, and brands 3 and 5 do not
differ significantly.

If $\bar{x}_{2.} = 14.15$ rather than 13.8 with the same computed w, then the
configuration of underscored means would be


$\bar{x}_{5.}$    $\bar{x}_{3.}$    $\bar{x}_{2.}$    $\bar{x}_{4.}$    $\bar{x}_{1.}$
13.1    13.3    14.15   14.3    14.5
------------    --------------------


EXAMPLE 10.6 A biologist wished to study the effects of ethanol on sleep time. A sample of 20 rats, 
matched for age and other characteristics, was selected, and each rat was given an 
oral injection having a particular concentration of ethanol per body weight. The 
rapid eye movement (REM) sleep time for each rat was then recorded for a 24-hour 
period, with the following results: 


Treatment (concentration of ethanol)                                       $x_{i.}$     $\bar{x}_{i.}$
0 (control)      88.6    73.2    91.4    68.0    75.2    396.4    79.28
1 g/kg           63.0    53.9    69.2    50.1    71.5    307.7    61.54
2 g/kg           44.9    59.5    40.2    56.3    38.7    239.6    47.92
4 g/kg           31.0    39.6    45.3    25.2    22.7    163.8    32.76


$x_{..} = 1107.5$    $\bar{x}_{..} = 55.375$


Does the data indicate that the true average REM sleep time depends on the 
concentration of ethanol? (This example is based on an experiment reported in 
“Relationship of Ethanol Blood Level to REM and Non-REM Sleep Time and 
Distribution in the Rat,’ Life Sciences, 1978: 839-846.) 
The $\bar{x}_{i.}$'s differ rather substantially from one another, but there is also a great deal
of variability within each sample. To answer the question precisely we must carry out the
ANOVA. The smallest and largest of the four sample standard deviations are 9.34
and 10.18, respectively, which supports the assumption of equal variances. A normal
probability plot of the 20 residuals shows a reasonably linear pattern, justifying the
assumption that the four REM sleep time distributions are normal. Thus it is
legitimate to employ the F test.

Table 10.4 is a SAS ANOVA table. The last column gives the P-value
as .0001. Using a significance level of .05, we reject the null hypothesis
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$, since P-value $= .0001 < .05 = \alpha$. True average REM
sleep time does appear to depend on concentration level.


Table 10.4 SAS ANOVA Table 


Analysis of Variance Procedure 
Dependent Variable: TIME 


Sum of Mean 
Source DF Squares Square F Value Pr>F 
Model 3 5882.35750 1960.78583 21.09 0.0001 
Error 16 1487.40000 92.96250 
Corrected 
Total 19 7369.75750 


There are I = 4 treatments and 16 df for error, from which $Q_{.05,4,16} = 4.05$ and
$w = 4.05\sqrt{93.0/5} = 17.47$. Ordering the means and underscoring yields


$\bar{x}_{4.}$      $\bar{x}_{3.}$      $\bar{x}_{2.}$      $\bar{x}_{1.}$
32.76   47.92   61.54   79.28
-------------
        -------------


The interpretation of this underscoring must be done with care, since we seem to
have concluded that treatments 2 and 3 do not differ, 3 and 4 do not differ, yet 2 and
4 do differ. The suggested way of expressing this is to say that although evidence
allows us to conclude that treatments 2 and 4 differ from one another, neither has
been shown to be significantly different from 3. Treatment 1 has a significantly
higher true average REM sleep time than any of the other treatments.

Figure 10.5 shows SAS output from the application of Tukey’s procedure. 


Alpha = 0.05   df = 16   MSE = 92.9625
Critical Value of Studentized Range = 4.046
Minimum Significant Difference = 17.446


Means with the same letter are not significantly different.


Tukey Grouping      Mean    N   TREATMENT
       A          79.280    5   0 (control)
       B          61.540    5   1 gm/kg
       B
  C    B          47.920    5   2 gm/kg
  C
  C               32.760    5   4 gm/kg

Figure 10.5 Tukey's method using SAS
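
The intervals of (10.4) for Example 10.6 can also be computed directly. The sketch below is illustrative only: the variable names are hypothetical, and the critical value 4.05 is the table value quoted in the example rather than something computed here, since the Python standard library has no Studentized range distribution.

```python
from itertools import combinations
from math import sqrt

# REM sleep times from Example 10.6, keyed by ethanol concentration.
rem_sleep = {
    "0 (control)": [88.6, 73.2, 91.4, 68.0, 75.2],
    "1 g/kg":      [63.0, 53.9, 69.2, 50.1, 71.5],
    "2 g/kg":      [44.9, 59.5, 40.2, 56.3, 38.7],
    "4 g/kg":      [31.0, 39.6, 45.3, 25.2, 22.7],
}
Q = 4.05                                # Q_{.05,4,16}, read from Appendix Table A.10

I = len(rem_sleep)
J = 5                                   # common sample size
means = {trt: sum(obs) / J for trt, obs in rem_sleep.items()}

# SSE is the sum of squared deviations of each observation from its own sample mean.
sse = sum((x - means[trt]) ** 2 for trt, obs in rem_sleep.items() for x in obs)
mse = sse / (I * (J - 1))
w = Q * sqrt(mse / J)                   # half-width shared by every interval

intervals = {}
for a, b in combinations(rem_sleep, 2):
    diff = means[a] - means[b]
    intervals[(a, b)] = (diff - w, diff + w)

# A pair differs significantly when its interval excludes 0.
significant = [pair for pair, (lo, hi) in intervals.items() if lo > 0 or hi < 0]

print(f"MSE = {mse:.4f}, w = {w:.2f}")
for pair, (lo, hi) in intervals.items():
    flag = "*" if pair in significant else " "
    print(f"{flag} {pair[0]:12s} - {pair[1]:7s} ({lo:7.2f}, {hi:7.2f})")
```

The pairs whose intervals contain 0 should be exactly those joined by an underscoring segment above, matching the letter groupings in the SAS output.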


The Interpretation of $\alpha$ in Tukey's Method


We stated previously that the simultaneous confidence level is controlled by Tukey’s
method. So what does “simultaneous” mean here? Consider calculating a 95% CI
for a population mean $\mu$ based on a sample from that population and then a 95% CI
for a population proportion p based on another sample selected independently of the
first one. Prior to obtaining data, the probability that the first interval will include
$\mu$ is .95, and this is also the probability that the second interval will include p.
Because the two samples are selected independently of one another, the probability
that both intervals will include the values of the respective parameters is
$(.95)(.95) = (.95)^2 \approx .90$. Thus the simultaneous or joint confidence level for the two intervals
is roughly 90%: if pairs of intervals are calculated over and over again from
independent samples, in the long run roughly 90% of the time the first interval will
capture $\mu$ and the second will include p. Similarly, if three CIs are calculated based on
independent samples, the simultaneous confidence level will be $100(.95)^3\% \approx 86\%$.
Clearly, as the number of intervals increases, the simultaneous confidence level that
all intervals capture their respective parameters will decrease.

Now suppose that we want to maintain the simultaneous confidence level at
95%. Then for two independent samples, the individual confidence level for each
would have to be $100\sqrt{.95}\% \approx 97.5\%$. The larger the number of intervals, the
higher the individual confidence level would have to be to maintain the 95%
simultaneous level.
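
The arithmetic for independent intervals is easy to tabulate. The following sketch (function names are illustrative) applies only to intervals based on independent samples, which is the situation described in the two preceding paragraphs.

```python
def simultaneous_level(individual_level, k):
    """Joint confidence level of k intervals from independent samples."""
    return individual_level ** k

def required_individual_level(simultaneous, k):
    """Individual level needed so k independent intervals reach the joint level."""
    return simultaneous ** (1 / k)

for k in (1, 2, 3, 6, 10):
    joint = simultaneous_level(0.95, k)
    needed = required_individual_level(0.95, k)
    print(f"k = {k:2d}: joint level {joint:.3f}, individual level needed {needed:.4f}")
```

For k = 2 and k = 3 the joint levels reproduce the values of roughly 90% and 86% quoted above.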

The tricky thing about the Tukey intervals is that they are not based on
independent samples: MSE appears in every one, and various intervals share the same
$\bar{x}_{i.}$'s (e.g., in the case I = 4, three different intervals all use $\bar{x}_{1.}$). This implies that
there is no straightforward probability argument for ascertaining the
simultaneous confidence level from the individual confidence levels. Nevertheless, it can be
shown that if $Q_{.05}$ is used, the simultaneous confidence level is controlled at 95%,
whereas using $Q_{.01}$ gives a simultaneous 99% level. To obtain a 95% simultaneous
level, the individual level for each interval must be considerably larger than 95%.
Said in a slightly different way, to obtain a 5% experimentwise or family error rate,
the individual or per-comparison error rate for each interval must be considerably
smaller than .05. Minitab asks the user to specify the family error rate (e.g., 5%) and
then includes on output the individual error rate (see Exercise 16).


Confidence Intervals for Other Parametric 
Functions 


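In some situations, a CI is desired for a function of the $\mu_i$'s more complicated than
a difference $\mu_i - \mu_j$. Let $\theta = \sum c_i \mu_i$, where the $c_i$'s are constants. One such
function is $\frac{1}{2}(\mu_1 + \mu_2) - \frac{1}{3}(\mu_3 + \mu_4 + \mu_5)$, which in the context of Example 10.5
measures the difference between the group consisting of the first two brands and that
of the last three brands. Because the $X_{ij}$'s are normally distributed with $E(X_{ij}) = \mu_i$
and $V(X_{ij}) = \sigma^2$, $\hat{\theta} = \sum c_i \bar{X}_{i.}$ is normally distributed, unbiased for $\theta$, and

$V(\hat{\theta}) = V\left(\sum c_i \bar{X}_{i.}\right) = \sum c_i^2 V(\bar{X}_{i.}) = \dfrac{\sigma^2}{J} \sum c_i^2$

Estimating $\sigma^2$ by MSE and forming $\hat{\sigma}_{\hat{\theta}}$ results in a t variable $(\hat{\theta} - \theta)/\hat{\sigma}_{\hat{\theta}}$, which can
be manipulated to obtain the following $100(1 - \alpha)\%$ confidence interval for $\sum c_i \mu_i$,

$\sum c_i \bar{x}_{i.} \pm t_{\alpha/2, I(J-1)} \sqrt{\dfrac{\text{MSE} \sum c_i^2}{J}}$    (10.5)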

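EXAMPLE 10.7 The parametric function for comparing the first two (store) brands of oil filter with the
last three (national) brands is $\theta = \frac{1}{2}(\mu_1 + \mu_2) - \frac{1}{3}(\mu_3 + \mu_4 + \mu_5)$, from which

$\sum c_i^2 = \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 + \left(-\frac{1}{3}\right)^2 + \left(-\frac{1}{3}\right)^2 + \left(-\frac{1}{3}\right)^2 = \frac{5}{6}$

With $\hat{\theta} = \frac{1}{2}(\bar{x}_{1.} + \bar{x}_{2.}) - \frac{1}{3}(\bar{x}_{3.} + \bar{x}_{4.} + \bar{x}_{5.}) = .583$ and MSE = .088, a 95%
interval is

$.583 \pm 2.021\sqrt{5(.088)/[(6)(9)]} = .583 \pm .182 = (.401, .765)$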
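
Formula (10.5) translates into a short function. This is a minimal sketch with hypothetical names; the t critical value 2.021 ($t_{.025,40}$) is taken from the example rather than computed, and the sample means are those of Example 10.5.

```python
from math import sqrt

def contrast_interval(coeffs, sample_means, mse, J, t_crit):
    """Return (estimate, half_width) for sum(c_i * mu_i) as in formula (10.5)."""
    estimate = sum(c * m for c, m in zip(coeffs, sample_means))
    half_width = t_crit * sqrt(mse * sum(c * c for c in coeffs) / J)
    return estimate, half_width

coeffs = [1 / 2, 1 / 2, -1 / 3, -1 / 3, -1 / 3]   # store brands vs. national brands
sample_means = [14.5, 13.8, 13.3, 14.3, 13.1]     # brand means from Example 10.5

theta_hat, half = contrast_interval(coeffs, sample_means, mse=0.088, J=9, t_crit=2.021)
print(f"{theta_hat:.3f} +/- {half:.3f} = ({theta_hat - half:.3f}, {theta_hat + half:.3f})")
```

Because the interval lies entirely above 0, the data suggest that the average for the first two brands exceeds the average for the last three.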


Sometimes an experiment is carried out to compare each of several “new” 
treatments to a control treatment. In such situations, a multiple comparisons tech- 
nique called Dunnett’s method is appropriate. 


EXERCISES Section 10.2 (11-21) 


11. An experiment to compare the spreading rates of five
different brands of yellow interior latex paint available in
a particular area used 4 gallons (J = 4) of each paint.
The sample average spreading rates (ft²/gal) for the five
brands were $\bar{x}_{1.} = 462.0$, $\bar{x}_{2.} = 512.8$, $\bar{x}_{3.} = 437.5$,
$\bar{x}_{4.} = 469.3$, and $\bar{x}_{5.} = 532.1$. The computed value of F
was found to be significant at level $\alpha = .05$. With
MSE = 272.8, use Tukey’s procedure to investigate
significant differences in the true average spreading rates
between brands.


12. In Exercise 11, suppose $\bar{x}_{3.} = 427.5$. Now which true
average spreading rates differ significantly from one
another? Be sure to use the method of underscoring to
illustrate your conclusions, and write a paragraph
summarizing your results.


13. Repeat Exercise 12 supposing that $\bar{x}_{2.} = 502.8$ in
addition to $\bar{x}_{3.} = 427.5$.


14. Use Tukey’s procedure on the data in Example 10.3 to
identify differences in true average bond strengths
among the five protocols.


15. Exercise 10.7 described an experiment in which 26
resistivity observations were made on each of six different
concrete mixtures. The article cited there gave the
following sample means: 14.18, 17.94, 18.00, 18.00, 25.74,
27.67. Apply Tukey’s method with a simultaneous
confidence level of 95% to identify significant differences,
and describe your findings (use MSE = 13.929).


16. Reconsider the axial stiffness data given in Exercise 8.
ANOVA output from Minitab follows:


Analysis of Variance for Stiffness

Source   DF      SS      MS       F       P
Length    4   43993   10998   10.48   0.000
Error    30   31475    1049
Total    34   75468

Level   N     Mean   StDev
 4      7   333.21   36.59
 6      7   368.06   28.57
 8      7   375.13   20.83
10      7   407.36   44.51
12      7   437.17   26.00

Pooled StDev = 32.39

Tukey’s pairwise comparisons
Family error rate = 0.0500
Individual error rate = 0.00693
Critical value = 4.10
Intervals for (column level mean) - (row level mean)

            4         6         8        10
 6      -85.0
         15.4
 8      -92.1     -57.3
          8.3      43.1
10     -124.3     -89.5     -82.4
        -23.9      10.9      18.0
12     -154.2    -119.3    -112.2     -80.0
        -53.8     -18.9     -11.8      20.4


a. Is it plausible that the variances of the five axial
stiffness index distributions are identical? Explain.

b. Use the output (without reference to our F table) to
test the relevant hypotheses.

c. Use the Tukey intervals given in the output to
determine which means differ, and construct the
corresponding underscoring pattern.


17. Refer to Exercise 5. Compute a 95% t CI for
$\theta = \frac{1}{2}(\mu_1 + \mu_2) - \mu_3$.


18. Consider the accompanying data on plant growth
after the application of five different types of growth
hormone.


1    13   17    7   14
2    21   13   20   17
3    18   15   20   17
4     7   11   18   10
5     6   11   15    8


a. Perform an F test at level $\alpha = .05$.
b. What happens when Tukey’s procedure is applied?


19. Consider a single-factor ANOVA experiment in which
I = 3, J = 5, $\bar{x}_{1.} = 10$, $\bar{x}_{2.} = 12$, and $\bar{x}_{3.} = 20$. Find a
value of SSE for which $f > F_{.05,2,12}$, so that $H_0$: $\mu_1 = \mu_2 = \mu_3$
is rejected, yet when Tukey’s procedure is
applied none of the $\mu_i$'s can be said to differ
significantly from one another.
20. Refer to Exercise 19 and suppose $\bar{x}_{1.} = 10$, $\bar{x}_{2.} = 15$, and
$\bar{x}_{3.} = 20$. Can you now find a value of SSE that produces
such a contradiction between the F test and Tukey’s
procedure?


21. The article “The Effect of Enzyme Inducing Agents on the
Survival Times of Rats Exposed to Lethal Levels of
Nitrogen Dioxide” (Toxicology and Applied Pharmacology,
1978: 169-174) reports the following data on survival times
for rats exposed to nitrogen dioxide (70 ppm) via different
injection regimens. There were J = 14 rats in each group.


Regimen                        $\bar{x}_{i.}$ (min)    $s_i$
1. Control                     166    32
2. 3-Methylcholanthrene        303    53
3. Allylisopropylacetamide     266    54
4. Phenobarbital               212    35
5. Chlorpromazine              202    34
6. p-Aminobenzoic Acid         184    31


a. Test the null hypothesis that true average survival time
does not depend on an injection regimen against the
alternative that there is some dependence on an
injection regimen using $\alpha = .01$.

b. Suppose that $100(1 - \alpha)\%$ CIs for k different
parametric functions are computed from the same ANOVA
data set. Then it is easily verified that the simultaneous
confidence level is at least $100(1 - k\alpha)\%$. Compute
CIs with a simultaneous confidence level of at least
98% for
$\mu_1 - \frac{1}{5}(\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6)$ and
$\frac{1}{4}(\mu_2 + \mu_3 + \mu_4 + \mu_5) - \mu_6$

10.3. More on Single-Factor ANOVA 


We now briefly consider some additional issues relating to single-factor ANOVA.
These include an alternative description of the model parameters, $\beta$ for the F test,
the relationship of the test to procedures previously considered, data transformation,
a random effects model, and formulas for the case of unequal sample sizes.


The ANOVA Model 


The assumptions of single-factor ANOVA can be described succinctly by means of
the “model equation”

$X_{ij} = \mu_i + \epsilon_{ij}$

where $\epsilon_{ij}$ represents a random deviation from the population or true treatment mean
$\mu_i$. The $\epsilon_{ij}$'s are assumed to be independent, normally distributed rv’s (implying that
the $X_{ij}$'s are also) with $E(\epsilon_{ij}) = 0$ [so that $E(X_{ij}) = \mu_i$] and $V(\epsilon_{ij}) = \sigma^2$ [from which
$V(X_{ij}) = \sigma^2$ for every i and j]. An alternative description of single-factor ANOVA
will give added insight and suggest appropriate generalizations to models involving
more than one factor. Define a parameter $\mu$ by

$\mu = \dfrac{1}{I} \sum_{i=1}^{I} \mu_i$

and the parameters $\alpha_1, \ldots, \alpha_I$ by

$\alpha_i = \mu_i - \mu \qquad (i = 1, \ldots, I)$

Then the treatment mean $\mu_i$ can be written as $\mu + \alpha_i$, where $\mu$ represents the true
average overall response in the experiment, and $\alpha_i$ is the effect, measured as a
departure from $\mu$, due to the ith treatment. Whereas we initially had I parameters, we now
have I + 1 ($\mu, \alpha_1, \ldots, \alpha_I$). However, because $\sum \alpha_i = 0$ (the average departure from
the overall mean response is zero), only I of these new parameters are independently
determined, so there are as many independent parameters as there were before. In
terms of $\mu$ and the $\alpha_i$'s, the model becomes

$X_{ij} = \mu + \alpha_i + \epsilon_{ij} \qquad (i = 1, \ldots, I; \; j = 1, \ldots, J)$

In Chapter 11, we will develop analogous models for multifactor ANOVA. The
claim that the $\mu_i$'s are identical is equivalent to the equality of the $\alpha_i$'s, and because
$\sum \alpha_i = 0$, the null hypothesis becomes

$H_0$: $\alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$

Recall that MSTr is an unbiased estimator of $\sigma^2$ when $H_0$ is true but otherwise
tends to overestimate $\sigma^2$. Here is a more precise result:

$E(\text{MSTr}) = \sigma^2 + \dfrac{J}{I - 1} \sum \alpha_i^2$

When $H_0$ is true, $\sum \alpha_i^2 = 0$ so $E(\text{MSTr}) = \sigma^2$ (MSE is unbiased whether or not $H_0$
is true). If $\sum \alpha_i^2$ is used as a measure of the extent to which $H_0$ is false, then a larger
value of $\sum \alpha_i^2$ will result in a greater tendency for MSTr to overestimate $\sigma^2$. In the
next chapter, formulas for expected mean squares for multifactor models will be
used to suggest how to form F ratios to test various hypotheses.


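Proof of the Formula for E(MSTr) For any rv Y, $E(Y^2) = V(Y) + [E(Y)]^2$, so

$E(\text{SSTr}) = E\left(\dfrac{1}{J}\sum_i X_{i.}^2 - \dfrac{1}{IJ} X_{..}^2\right) = \dfrac{1}{J}\sum_i E(X_{i.}^2) - \dfrac{1}{IJ} E(X_{..}^2)$

$= \dfrac{1}{J}\sum_i \{V(X_{i.}) + [E(X_{i.})]^2\} - \dfrac{1}{IJ}\{V(X_{..}) + [E(X_{..})]^2\}$

$= \dfrac{1}{J}\sum_i \{J\sigma^2 + [J(\mu + \alpha_i)]^2\} - \dfrac{1}{IJ}[IJ\sigma^2 + (IJ\mu)^2]$

$= I\sigma^2 + IJ\mu^2 + 2\mu J \sum_i \alpha_i + J \sum_i \alpha_i^2 - \sigma^2 - IJ\mu^2$

$= (I - 1)\sigma^2 + J \sum_i \alpha_i^2$    (since $\sum \alpha_i = 0$)

The result then follows from the relationship MSTr = SSTr/(I − 1).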
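
The formula for E(MSTr) can be checked empirically by simulation. The sketch below is an informal Monte Carlo check, not part of the proof; the parameter values ($\mu = 10$, $\sigma = 1$, J = 5, and the particular $\alpha_i$'s) and the function name are arbitrary illustrative choices.

```python
import random

def simulate_mean_mstr(mu, alphas, sigma, J, reps, seed=0):
    """Average MSTr over many simulated data sets from X_ij = mu + alpha_i + eps_ij."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    I = len(alphas)
    total = 0.0
    for _ in range(reps):
        # Sample mean of J normal observations for each treatment.
        means = [sum(rng.gauss(mu + a, sigma) for _ in range(J)) / J for a in alphas]
        grand_mean = sum(means) / I
        sstr = J * sum((m - grand_mean) ** 2 for m in means)
        total += sstr / (I - 1)
    return total / reps

alphas = [-1.0, 0.0, 0.0, 1.0]         # treatment effects; they sum to zero
sigma, J = 1.0, 5
I = len(alphas)

simulated = simulate_mean_mstr(mu=10.0, alphas=alphas, sigma=sigma, J=J, reps=20000)
theoretical = sigma ** 2 + J / (I - 1) * sum(a * a for a in alphas)

print(f"simulated  E(MSTr) ~ {simulated:.3f}")
print(f"theoretical E(MSTr) = {theoretical:.3f}")
```

With a large number of replications the simulated average should be close to the theoretical value $\sigma^2 + [J/(I-1)]\sum \alpha_i^2$; setting every $\alpha_i$ to 0 should bring both close to $\sigma^2$.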


$\beta$ for the F Test


Consider a set of parameter values $\alpha_1, \alpha_2, \ldots, \alpha_I$ for which $H_0$ is not true. The
probability of a type II error, $\beta$, is the probability that $H_0$ is not rejected when that set is
the set of true values. One might think that $\beta$ would have to be determined separately
for each different configuration of $\alpha_i$'s. Fortunately, since $\beta$ for the F test depends
on the $\alpha_i$'s and $\sigma^2$ only through $\sum \alpha_i^2/\sigma^2$, it can be simultaneously evaluated for many
different alternatives. For example, $\sum \alpha_i^2 = 4$ for each of the following sets of $\alpha_i$'s for
which $H_0$ is false, so $\beta$ is identical for all three alternatives:


1. $\alpha_1 = -1$, $\alpha_2 = -1$, $\alpha_3 = 1$, $\alpha_4 = 1$
2. $\alpha_1 = -\sqrt{2}$, $\alpha_2 = \sqrt{2}$, $\alpha_3 = 0$, $\alpha_4 = 0$
3. $\alpha_1 = -\sqrt{3}$, $\alpha_2 = \sqrt{1/3}$, $\alpha_3 = \sqrt{1/3}$, $\alpha_4 = \sqrt{1/3}$


The quantity $J\sum \alpha_i^2/\sigma^2$ is called the noncentrality parameter for one-way
ANOVA (because when $H_0$ is false the test statistic has a noncentral F distribution
with this as one of its parameters), and $\beta$ is a decreasing function of the value of this
parameter. Thus, for fixed values of $\sigma^2$ and J, the null hypothesis is more likely to be
rejected for alternatives far from $H_0$ (large $\sum \alpha_i^2$) than for alternatives close to $H_0$. For
a fixed value of $\sum \alpha_i^2$, $\beta$ decreases as the sample size J on each treatment increases,
and it increases as the variance $\sigma^2$ increases (since greater underlying variability
makes it more difficult to detect any given departure from $H_0$).

Because hand computation of $\beta$ and sample size determination for the F test
are quite difficult (as in the case of t tests), statisticians have constructed sets of
curves from which $\beta$ can be obtained. Sets of curves for numerator df $\nu_1 = 3$ and
$\nu_1 = 4$ are displayed in Figure 10.6* and Figure 10.7*, respectively. After the values
of $\sigma^2$ and the $\alpha_i$'s for which $\beta$ is desired are specified, these are used to compute the
value of $\phi$, where $\phi^2 = (J/I)\sum \alpha_i^2/\sigma^2$. We then enter the appropriate set of curves at


Figure 10.6 Power curves for the ANOVA F test ($\nu_1 = 3$)
(vertical axis: Power = 1 − $\beta$; horizontal axes: $\phi$ for $\alpha = .05$ and $\phi$ for $\alpha = .01$)


Figure 10.7 Power curves for the ANOVA F test ($\nu_1 = 4$)
(vertical axis: Power = 1 − $\beta$; horizontal axes: $\phi$ for $\alpha = .05$ and $\phi$ for $\alpha = .01$)


* From E. S. Pearson and H. O. Hartley, “Charts of the Power Function for Analysis of Variance Tests, 
Derived from the Non-central F Distribution,” Biometrika, vol. 38, 1951: 112. 
the value of $\phi$ on the horizontal axis, move up to the curve associated with error df
$\nu_2$, and move over to the value of power on the vertical axis. Finally, $\beta = 1 - \text{power}$.


EXAMPLE 10.8 The effects of four different heat treatments on yield point (tons/in²) of steel ingots are
to be investigated. A total of eight ingots will be cast using each treatment. Suppose
the true standard deviation of yield point for any of the four treatments is $\sigma = 1$. How
likely is it that $H_0$ will not be rejected at level .05 if three of the treatments have the
same expected yield point and the other treatment has an expected yield point that is
1 ton/in² greater than the common value of the other three (i.e., the fourth yield is on
average 1 standard deviation above those for the first three treatments)?

Suppose that $\mu_1 = \mu_2 = \mu_3$ and $\mu_4 = \mu_1 + 1$, so that $\mu = (\sum \mu_i)/4 = \mu_1 + 1/4$.
Then $\alpha_1 = \mu_1 - \mu = -1/4$, $\alpha_2 = -1/4$, $\alpha_3 = -1/4$, $\alpha_4 = 3/4$, so

$$\phi^2 = \frac{8}{4}\left[\left(-\frac{1}{4}\right)^2 + \left(-\frac{1}{4}\right)^2 + \left(-\frac{1}{4}\right)^2 + \left(\frac{3}{4}\right)^2\right] = \frac{3}{2}$$

and $\phi = 1.22$. Degrees of freedom for the F test are $\nu_1 = I - 1 = 3$ and
$\nu_2 = I(J - 1) = 28$, so interpolating visually between $\nu_2 = 20$ and $\nu_2 = 30$ gives
power $\approx .47$ and $\beta \approx .53$. This $\beta$ is rather large, so we might decide to increase the
value of $J$. How many ingots of each type would be required to yield $\beta \approx .05$ for the
alternative under consideration? By trying different values of $J$, it can be verified that
$J = 24$ will meet the requirement, but any smaller $J$ will not.
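
As a rough cross-check on the chart reading in Example 10.8, the sketch below computes $\phi$ from the specified means and then estimates power by simulation rather than from the noncentral F distribution. It is only an illustration under stated assumptions: the function names are hypothetical, the critical value 2.95 is the tabled $F_{.05,3,28}$ entered by hand, and the simulated power is subject to Monte Carlo error.

```python
import math
import random

def phi_squared(means, J, sigma):
    """phi^2 = (J/I) * sum(alpha_i^2) / sigma^2 for a common sample size J."""
    I = len(means)
    grand = sum(means) / I
    return (J / I) * sum((m - grand) ** 2 for m in means) / sigma ** 2

def f_statistic(samples):
    """Single-factor ANOVA F statistic for equal-size samples."""
    I, J = len(samples), len(samples[0])
    means = [sum(s) / J for s in samples]
    grand = sum(means) / I
    mstr = J * sum((m - grand) ** 2 for m in means) / (I - 1)
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    return mstr / (sse / (I * (J - 1)))

def simulated_power(means, J, sigma, f_crit, reps=4000, seed=1):
    """Fraction of simulated normal data sets whose F statistic reaches f_crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        samples = [[rng.gauss(m, sigma) for _ in range(J)] for m in means]
        hits += f_statistic(samples) >= f_crit
    return hits / reps

# Only differences among the means matter, so the common value is set to 0.
means = [0.0, 0.0, 0.0, 1.0]
phi = math.sqrt(phi_squared(means, J=8, sigma=1.0))
# 2.95 is the tabled F_.05,3,28 (assumption: entered by hand, not computed here).
power = simulated_power(means, J=8, sigma=1.0, f_crit=2.95)
print(f"phi = {phi:.2f}, simulated power = {power:.2f}")
```

The simulated power should land near the chart value of about .47; changing `J` in both calls (together with the matching critical value) shows how power grows with the per-treatment sample size.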


As an alternative to the use of power curves, the SAS statistical software pack-
age has a function that calculates the cumulative area under a noncentral F curve
(inputs $F_\alpha$, numerator df, denominator df, and $\phi^2$), and this area is $\beta$. Minitab does
this and also something rather different. The user is asked to specify the maximum
difference between $\mu_i$'s rather than the individual means. For example, we might
wish to calculate the power of the test when $I = 4$, $\mu_1 = 100$, $\mu_2 = 101$, $\mu_3 = 102$,
and $\mu_4 = 106$. Then the maximum difference is $106 - 100 = 6$. However, the power
depends not only on this maximum difference but on the values of all the $\mu_i$'s. In this
situation Minitab calculates the smallest possible value of power subject to $\mu_1 = 100$
and $\mu_4 = 106$, which occurs when the two other $\mu$'s are both halfway between 100
and 106. If this power is .85, then we can say that the power is at least .85 and $\beta$ is
at most .15 when the two most extreme $\mu$'s are separated by 6 (the common sample
size, $\alpha$, and $\sigma$ must also be specified). The software will also determine the neces-
sary common sample size if maximum difference and minimum power are specified.


Relationship of the F Test to the t Test 


When the number of treatments or populations is $I = 2$, all formulas and results con-
nected with the F test still make sense, so ANOVA can be used to test $H_0$: $\mu_1 = \mu_2$
versus $H_a$: $\mu_1 \ne \mu_2$. In this case, a two-tailed, two-sample t test can also be used.
In Section 9.3, we mentioned the pooled t test, which requires equal variances, as
an alternative to the two-sample t procedure. It can be shown that the single-factor
ANOVA F test and the two-tailed pooled t test are equivalent; for any given data
set, the P-values for the two tests will be identical, so the same conclusion will be
reached by either test.
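
The equivalence can be checked numerically: with $I = 2$, the ANOVA statistic equals the square of the pooled t statistic. The sketch below is an illustration only; it borrows the permanent molding and die casting samples from Example 10.9 later in this section, and the helper names are hypothetical.

```python
import math

# Two of the three samples from Example 10.9 (elastic modulus, GPa).
permanent = [45.5, 45.3, 45.4, 44.4, 44.6, 43.9, 44.6, 44.0]
die = [44.2, 43.9, 44.7, 44.2, 44.0, 43.8, 44.6, 43.1]

def mean(xs):
    return sum(xs) / len(xs)

def sum_sq_dev(xs):
    """Sum of squared deviations about the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

m, n = len(permanent), len(die)

# Pooled two-sample t statistic.
sp2 = (sum_sq_dev(permanent) + sum_sq_dev(die)) / (m + n - 2)
t = (mean(permanent) - mean(die)) / math.sqrt(sp2 * (1 / m + 1 / n))

# Single-factor ANOVA F statistic with I = 2 (numerator df = 1).
grand = mean(permanent + die)
sstr = m * (mean(permanent) - grand) ** 2 + n * (mean(die) - grand) ** 2
mse = (sum_sq_dev(permanent) + sum_sq_dev(die)) / (m + n - 2)
f = (sstr / 1) / mse

print(f"t = {t:.3f}, t^2 = {t * t:.3f}, f = {f:.3f}")
```

Because MSE is exactly the pooled variance when $I = 2$, the two printed values `t^2` and `f` agree up to floating-point rounding.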

The two-sample t test is more flexible than the F test when $I = 2$ for two rea-
sons. First, it is valid without the assumption that $\sigma_1 = \sigma_2$; second, it can be used to
test $H_a$: $\mu_1 > \mu_2$ (an upper-tailed t test) or $H_a$: $\mu_1 < \mu_2$ as well as $H_a$: $\mu_1 \ne \mu_2$. In the
case of $I \ge 3$, there is unfortunately no general test procedure known to have good
properties without assuming equal variances.
Unequal Sample Sizes


When the sample sizes from each population or treatment are not equal, let
$J_1, J_2, \ldots, J_I$ denote the $I$ sample sizes, and let $n = \sum_i J_i$ denote the total number of
observations. The accompanying box gives ANOVA formulas and the test procedure.


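Fundamental Identity: SST = SSTr + SSE

$$\text{SST} = \sum_{i=1}^{I}\sum_{j=1}^{J_i} (X_{ij} - \bar{X}_{..})^2 = \sum_{i=1}^{I}\sum_{j=1}^{J_i} X_{ij}^2 - \frac{1}{n}X_{..}^2 \qquad \text{df} = n - 1$$

$$\text{SSTr} = \sum_{i=1}^{I}\sum_{j=1}^{J_i} (\bar{X}_{i.} - \bar{X}_{..})^2 = \sum_{i=1}^{I}\frac{1}{J_i}X_{i.}^2 - \frac{1}{n}X_{..}^2 \qquad \text{df} = I - 1$$

$$\text{SSE} = \sum_{i=1}^{I}\sum_{j=1}^{J_i} (X_{ij} - \bar{X}_{i.})^2 = \text{SST} - \text{SSTr} \qquad \text{df} = \sum (J_i - 1) = n - I$$

Test statistic:

$$F = \frac{\text{MSTr}}{\text{MSE}} \quad \text{where} \quad \text{MSTr} = \frac{\text{SSTr}}{I - 1} \quad \text{MSE} = \frac{\text{SSE}}{n - I}$$

Statistical theory says that the test statistic has an F distribution with numera-
tor df $I - 1$ and denominator df $n - I$ when $H_0$ is true. As in the case of equal
sample sizes, the larger the value of F, the stronger is the evidence against $H_0$.
Therefore the test is upper-tailed; the P-value is the area under the $F_{I-1,\,n-I}$
curve to the right of $f$.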


EXAMPLE 10.9 The article “On the Development of a New Approach for the Determination of Yield
Strength in Mg-based Alloys” (Light Metal Age, Oct. 1998: 51-53) presented the
following data on elastic modulus (GPa) obtained by a new ultrasonic method for speci-
mens of a certain alloy produced using three different casting processes.

                                                                     J_i   x_i.    x̄_i.
Permanent molding   45.5  45.3  45.4  44.4  44.6  43.9  44.6  44.0     8   357.7   44.71
Die casting         44.2  43.9  44.7  44.2  44.0  43.8  44.6  43.1     8   352.5   44.06
Plaster molding     46.0  45.9  44.8  46.2  45.1  45.5                 6   273.5   45.58
                                                                      22   983.7

Let $\mu_1$, $\mu_2$, and $\mu_3$ denote the true average elastic moduli for the three different pro-
cesses under the given circumstances. The relevant hypotheses are $H_0$: $\mu_1 = \mu_2 = \mu_3$
versus $H_a$: at least two of the $\mu_i$'s are different. The test statistic is, of course,
$F = \text{MSTr}/\text{MSE}$, based on $I - 1 = 2$ numerator df and $n - I = 22 - 3 = 19$
denominator df. Relevant quantities include

$$\sum\sum x_{ij}^2 = 43{,}998.73 \qquad \text{CF} = \frac{983.7^2}{22} = 43{,}984.80$$

$$\text{SST} = 43{,}998.73 - 43{,}984.80 = 13.93$$

$$\text{SSTr} = \frac{357.7^2}{8} + \frac{352.5^2}{8} + \frac{273.5^2}{6} - 43{,}984.80 = 7.93$$

$$\text{SSE} = 13.93 - 7.93 = 6.00$$

The remaining computations are displayed in the accompanying ANOVA table.
Since $F_{.001,2,19} = 10.16 < 12.56 = f$, the P-value is smaller than .001. Thus the null
hypothesis should be rejected at any reasonable significance level; there is compelling
evidence for concluding that true average elastic modulus somehow depends on
which casting process is used.

Source of Variation    df    Sum of Squares    Mean Square    f
Treatments              2         7.93            3.965       12.56
Error                  19         6.00             .3158
Total                  21        13.93
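
The computations of Example 10.9 can be reproduced with the computing formulas for unequal sample sizes. The sketch below is a minimal illustration, not a substitute for a statistical package; the function name `one_way_anova` and the dictionary keys are hypothetical, and no P-value is computed.

```python
def one_way_anova(samples):
    """Single-factor ANOVA sums of squares and F statistic; sample sizes may differ."""
    I = len(samples)
    n = sum(len(s) for s in samples)
    grand_total = sum(sum(s) for s in samples)
    cf = grand_total ** 2 / n                                 # correction factor x..^2 / n
    sst = sum(x * x for s in samples for x in s) - cf
    sstr = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = sst - sstr
    mstr = sstr / (I - 1)
    mse = sse / (n - I)
    return {"df_tr": I - 1, "df_e": n - I, "SST": sst, "SSTr": sstr,
            "SSE": sse, "MSTr": mstr, "MSE": mse, "f": mstr / mse}

# Elastic modulus data (GPa) from Example 10.9.
permanent = [45.5, 45.3, 45.4, 44.4, 44.6, 43.9, 44.6, 44.0]
die = [44.2, 43.9, 44.7, 44.2, 44.0, 43.8, 44.6, 43.1]
plaster = [46.0, 45.9, 44.8, 46.2, 45.1, 45.5]

table = one_way_anova([permanent, die, plaster])
for key, value in table.items():
    print(f"{key:>5}: {value:.4f}" if isinstance(value, float) else f"{key:>5}: {value}")
```

The printed values agree with the ANOVA table above to the rounding used there.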


There is more controversy among statisticians regarding which multiple compari-
sons procedure to use when sample sizes are unequal than there is in the case of
equal sample sizes. The procedure that we present here is recommended in the
excellent book Beyond ANOVA: Basics of Applied Statistics (see the chapter bib-
liography) for use when the $I$ sample sizes $J_1, J_2, \ldots, J_I$ are reasonably close to one
another (“mild imbalance”). It modifies Tukey’s method by using averages of pairs
of $1/J_i$’s in place of $1/J$.


Let

$$w_{ij} = Q_{\alpha,I,n-I} \cdot \sqrt{\frac{\text{MSE}}{2}\left(\frac{1}{J_i} + \frac{1}{J_j}\right)}$$

Then the probability is approximately $1 - \alpha$ that

$$\bar{X}_{i.} - \bar{X}_{j.} - w_{ij} \le \mu_i - \mu_j \le \bar{X}_{i.} - \bar{X}_{j.} + w_{ij}$$

for every $i$ and $j$ ($i = 1, \ldots, I$ and $j = 1, \ldots, I$) with $i \ne j$.


The simultaneous confidence level $100(1 - \alpha)\%$ is only approximate rather than
exact as it is with equal sample sizes. Underscoring can still be used, but now the $w_{ij}$
factor used to decide whether $\bar{x}_{i.}$ and $\bar{x}_{j.}$ can be connected will depend on $J_i$ and $J_j$.


EXAMPLE 10.10 (Example 10.9 continued) The sample sizes for the elastic modulus data were $J_1 = 8$, $J_2 = 8$, $J_3 = 6$, and
$I = 3$, $n - I = 19$, MSE = .316. A simultaneous confidence level of approximately
95% requires $Q_{.05,3,19} = 3.59$, from which

$$w_{12} = 3.59\sqrt{\frac{.316}{2}\left(\frac{1}{8} + \frac{1}{8}\right)} = .713 \qquad w_{13} = .771 \qquad w_{23} = .771$$

Since $\bar{x}_{1.} - \bar{x}_{2.} = 44.71 - 44.06 = .65 < w_{12}$, $\mu_1$ and $\mu_2$ are judged not sig-
nificantly different. The accompanying underscoring scheme shows that $\mu_1$ and $\mu_3$
appear to differ significantly, as do $\mu_2$ and $\mu_3$.


2. Die    1. Permanent    3. Plaster
44.06     44.71           45.58
----------------------
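
The pairwise comparisons of Example 10.10 can be sketched in a few lines. This is an illustration only: the Studentized range value 3.59 is the tabled $Q_{.05,3,19}$ entered by hand, the sample means are the rounded values from the example, and the variable names are hypothetical.

```python
import math
from itertools import combinations

labels = {1: "Permanent", 2: "Die", 3: "Plaster"}
J = {1: 8, 2: 8, 3: 6}                    # sample sizes
xbar = {1: 44.71, 2: 44.06, 3: 45.58}     # sample means from Example 10.9
mse = 0.316
q = 3.59                                  # tabled Q_.05,3,19 (assumption: entered by hand)

def w(i, j):
    """Modified Tukey yardstick for comparing treatments i and j."""
    return q * math.sqrt(mse / 2 * (1 / J[i] + 1 / J[j]))

comparisons = {}
for i, j in combinations(sorted(J), 2):
    diff = abs(xbar[i] - xbar[j])
    comparisons[(i, j)] = (w(i, j), diff, diff > w(i, j))
    verdict = "differ" if diff > w(i, j) else "not significantly different"
    print(f"{labels[i]} vs {labels[j]}: |diff| = {diff:.2f}, w = {w(i, j):.3f} -> {verdict}")
```

Only the pair (Permanent, Die) falls within its yardstick, which matches the underscoring scheme above.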


Data Transformation 


The use of ANOVA methods can be invalidated by substantial differences in the vari-
ances $\sigma_1^2, \ldots, \sigma_I^2$ (which until now have been assumed equal with common value $\sigma^2$).
It sometimes happens that $V(X_{ij}) = \sigma_i^2 = g(\mu_i)$, a known function of $\mu_i$ (so that when
$H_0$ is false, the variances are not equal). For example, if $X_{ij}$ has a Poisson distribution
with parameter $\lambda_i$ (approximately normal if $\lambda_i \ge 10$), then $\mu_i = \lambda_i$ and $\sigma_i^2 = \lambda_i$, so
$g(\mu_i) = \mu_i$ is the known function. In such cases, one can often transform the $X_{ij}$'s
to $h(X_{ij})$ so that they will have approximately equal variances (while leaving the
transformed variables approximately normal), and then the F test can be used on
the transformed observations. The key idea in choosing $h(\cdot)$ is that often $V[h(X_{ij})] \approx
V(X_{ij}) \cdot [h'(\mu_i)]^2 = g(\mu_i) \cdot [h'(\mu_i)]^2$. We now wish to find the function $h(\cdot)$ for which
$g(\mu_i) \cdot [h'(\mu_i)]^2 = c$ (a constant) for every $i$.


PROPOSITION If $V(X_{ij}) = g(\mu_i)$, a known function of $\mu_i$, then a transformation $h(X_{ij})$ that
“stabilizes the variance” so that $V[h(X_{ij})]$ is approximately the same for each $i$
is given by $h(x) \propto \int [g(x)]^{-1/2}\,dx$.


In the Poisson case, $g(x) = x$, so $h(x)$ should be proportional to $\int x^{-1/2}\,dx = 2x^{1/2}$.
Thus Poisson data should be transformed to $h(x_{ij}) = \sqrt{x_{ij}}$ before the analysis.
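
To see the stabilizing effect numerically, the sketch below computes the variance of $X$ and of $\sqrt{X}$ for several Poisson means by summing directly over the Poisson mass function. It is an illustration under stated assumptions: the function name and the truncation point of the sum are choices made here, and the value 1/4 is what the approximation $V[h(X)] \approx g(\mu)[h'(\mu)]^2$ predicts for $h(x) = \sqrt{x}$.

```python
import math

def poisson_mean_var(h, lam):
    """Mean and variance of h(X) for X ~ Poisson(lam), via a truncated pmf sum."""
    kmax = int(lam + 20 * math.sqrt(lam) + 50)   # far enough into the upper tail
    p = math.exp(-lam)                           # P(X = 0)
    m1 = m2 = 0.0
    for k in range(kmax + 1):
        if k > 0:
            p *= lam / k                         # P(X = k) from P(X = k - 1)
        value = h(k)
        m1 += p * value
        m2 += p * value * value
    return m1, m2 - m1 * m1

results = {}
for lam in (10, 20, 50):
    _, var_raw = poisson_mean_var(lambda k: float(k), lam)
    _, var_sqrt = poisson_mean_var(math.sqrt, lam)
    results[lam] = (var_raw, var_sqrt)
    print(f"lambda = {lam:2d}: V(X) = {var_raw:6.2f}, V(sqrt(X)) = {var_sqrt:.3f}")
```

The variance of the untransformed variable grows with $\lambda$, whereas the variance of $\sqrt{X}$ stays close to 1/4 across the three means.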


A Random Effects Model 


The single-factor problems considered so far have all been assumed to be examples 
of a fixed effects ANOVA model. By this we mean that the chosen levels of the 
factor under study are the only ones considered relevant by the experimenter. The 
single-factor fixed effects model is 


$$X_{ij} = \mu + \alpha_i + \epsilon_{ij} \qquad \sum \alpha_i = 0 \qquad (10.6)$$


where the $\epsilon_{ij}$’s are random and both $\mu$ and the $\alpha_i$’s are fixed parameters.

In some single-factor problems, the particular levels studied by the experi-
menter are chosen, either by design or through sampling, from a large population of
levels. For example, to study the effects on task performance time of using different 
operators on a particular machine, a sample of five operators might be chosen from 
a large pool of operators. Similarly, the effect of soil pH on the yield of maize plants 
might be studied by using soils with four specific pH values chosen from among 
the many possible pH levels. When the levels used are selected at random from a 
larger population of possible levels, the factor is said to be random rather than fixed, 
and the fixed effects model (10.6) is no longer appropriate. An analogous random 
effects model is obtained by replacing the fixed $\alpha_i$’s in (10.6) by random variables.


$$X_{ij} = \mu + A_i + \epsilon_{ij} \quad \text{with} \quad E(A_i) = E(\epsilon_{ij}) = 0$$

$$V(\epsilon_{ij}) = \sigma^2 \qquad V(A_i) = \sigma_A^2 \qquad (10.7)$$

all $A_i$’s and $\epsilon_{ij}$’s normally distributed and independent of one another.


The condition $E(A_i) = 0$ in (10.7) is similar to the condition $\sum \alpha_i = 0$ in (10.6);
it states that the expected or average effect of the $i$th level measured as a departure
from $\mu$ is zero.

For the random effects model (10.7), the hypothesis of no effects due to dif-
ferent levels is $H_0$: $\sigma_A^2 = 0$, which says that different levels of the factor contribute
nothing to variability of the response. Although the hypotheses in the single-factor
fixed and random effects models are different, they are tested in exactly the same
way: P-value = area under the $F_{I-1,\,n-I}$ curve to the right of $f = \text{MSTr}/\text{MSE}$, where
$J_1, J_2, \ldots, J_I$ are the sample sizes and $n = \sum J_i$. This can be justified intuitively by
noting that $E(\text{MSE}) = \sigma^2$ and

$$E(\text{MSTr}) = \sigma^2 + \frac{1}{I - 1}\left(n - \frac{\sum J_i^2}{n}\right)\sigma_A^2 \qquad (10.8)$$

The factor in parentheses on the right side of (10.8) is nonnegative, so again
$E(\text{MSTr}) = \sigma^2$ if $H_0$ is true and $E(\text{MSTr}) > \sigma^2$ if $H_0$ is false.
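
The multiplier of $\sigma_A^2$ in (10.8) depends only on the sample sizes, so it is easy to evaluate. The short sketch below does so for the sample sizes 8, 8, 6 (borrowed from Example 10.9 purely as an illustration); the function name is hypothetical.

```python
def sigma_a_multiplier(sizes):
    """Coefficient of sigma_A^2 in E(MSTr): (n - sum(J_i^2)/n) / (I - 1)."""
    I = len(sizes)
    n = sum(sizes)
    return (n - sum(J * J for J in sizes) / n) / (I - 1)

sizes = [8, 8, 6]
coef = sigma_a_multiplier(sizes)
print(f"E(MSTr) = sigma^2 + {coef:.4f} * sigma_A^2 for sample sizes {sizes}")
```

Since the multiplier is positive, any $\sigma_A^2 > 0$ pushes $E(\text{MSTr})$ above $\sigma^2$, which is why large values of $f$ count against $H_0$.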

EXAMPLE 10.11 The study of nondestructive forces and stresses in materials furnishes important
information for efficient engineering design. The article “Zero-Force Travel-Time
Parameters for Ultrasonic Head-Waves in Railroad Rail” (Materials Evaluation,
1985: 854-858) reports on a study of travel time for a certain type of wave that
results from longitudinal stress of rails used for railroad track. Three measurements
were made on each of six rails randomly selected from a population of rails. The
investigators used random effects ANOVA to decide whether some variation in
travel time could be attributed to “between-rail variability.” The data is given in the
accompanying table (each value, in nanoseconds, resulted from subtracting 36.1 μs
from the original observation) along with the derived ANOVA table. The value $f$ is
highly significant; $H_0$: $\sigma_A^2 = 0$ is rejected in favor of the conclusion that differences
between rails are a source of travel-time variability.


Rail                      x_i.
1:     55     53     54    162
2:     26     37     32     95
3:     78     91     85    254
4:     92    100     96    288
5:     49     51     50    150
6:     80     85     83    248
                  x.. =   1197

Source of Variation    df    Sum of Squares    Mean Square    f
Treatments              5        9310.5           1862.1      115.2
Error                  12         194.0            16.17
Total                  17        9504.5


EXERCISES Section 10.3 (22-34)


22. The following data refers to yield of tomatoes (kg/plot)
for four different levels of salinity. Salinity level here
refers to electrical conductivity (EC), where the chosen
levels were EC = 1.6, 3.8, 6.0, and 10.2 nmhos/cm.

 1.6:  59.5  53.3  56.8  63.1  58.7
 3.8:  55.2  59.1  52.8  54.5
 6.0:  51.7  48.8  53.9  49.0
10.2:  44.6  48.5  41.0  47.3  46.1

Use the F test at level $\alpha = .05$ to test for any differ-
ences in true average yield due to the different salinity
levels.


23. Apply the modified Tukey’s method to the data in Exer-
cise 22 to identify significant differences among the
$\mu_i$’s.


24. The accompanying summary data on skeletal-muscle CS
activity (nmol/min/mg) appeared in the article “Impact
of Lifelong Sedentary Behavior on Mitochondrial
Function of Mice Skeletal Muscle” (J. of Gerontology,
2009: 927-939):

                 Young    Old Sedentary    Old Active
Sample size        10           8              10
Sample mean      46.68        47.71          58.24
Sample sd         7.16         5.59           8.43

Carry out a test to decide whether true average activity differs
for the three groups. If appropriate, investigate differences
amongst the means with a multiple comparisons method.
25. Lipids provide much of the dietary energy in the bodies
of infants and young children. There is a growing interest
in the quality of the dietary lipid supply during infancy as
a major determinant of growth, visual and neural devel-
opment, and long-term health. The article “Essential Fat
Requirements of Preterm Infants” (Amer. J. of
Clinical Nutrition, 2000: 245S-250S) reported the fol-
lowing data on total polyunsaturated fats (%) for infants
who were randomized to four different feeding regimens:
breast milk, corn-oil-based formula, soy-oil-based for-
mula, or soy-and-marine-oil-based formula:

Regimen        Sample Size    Sample Mean    Sample SD
Breast milk         8            43.0           1.5
CO                 13            42.4           1.3
SO                 17            43.1           1.2
SMO                14            43.5           1.2

a. What assumptions must be made about the four total
polyunsaturated fat distributions before carrying out
a single-factor ANOVA to decide whether there are
any differences in true average fat content?

b. Carry out the test suggested in part (a). What can be
said about the P-value?


26. Samples of six different brands of diet/imitation marga-
rine were analyzed to determine the level of physiologi-
cally active polyunsaturated fatty acids (PAPFUA, in
percentages), resulting in the following data:

Imperial         14.1  13.6  14.4  14.3
Parkay           12.8  12.5  13.4  13.0  12.3
Blue Bonnet      13.5  13.4  14.1  14.3
Chiffon          13.2  12.7  12.6  13.9
Mazola           16.8  17.2  16.4  17.3  18.0
Fleischmann’s    18.1  17.2  18.7  18.4

(The preceding numbers are fictitious, but the sample
means agree with data reported in the January 1975 issue
of Consumer Reports.)

a. Use ANOVA to test for differences among the true
average PAPFUA percentages for the different brands.

b. Compute CIs for all $(\mu_i - \mu_j)$’s.

c. Mazola and Fleischmann’s are corn-based, whereas
the others are soybean-based. Compute a CI for

$$\frac{\mu_1 + \mu_2 + \mu_3 + \mu_4}{4} - \frac{\mu_5 + \mu_6}{2}$$

[Hint: Modify the expression for $V(\hat{\theta})$ that led to (10.5)
in the previous section.]


27. Although tea is the world’s most widely consumed bev-
erage after water, little is known about its nutritional
value. Folacin is the only B vitamin present in any sig-
nificant amount in tea, and recent advances in assay
methods have made accurate determination of folacin
content feasible. Consider the accompanying data on
folacin content for randomly selected specimens of the
four leading brands of green tea.

1:  7.9  6.2  6.6  8.6  8.9  10.1  9.6
2:  5.7  7.5  9.8  6.1  8.4
3:  6.8  7.5  5.0  7.4  5.3  6.1
4:  6.4  7.1  7.9  4.5  5.0  4.0

(Data is based on “Folacin Content of Tea,” J. of the
Amer. Dietetic Assoc., 1983: 627-632.) Does this data
suggest that true average folacin content is the same for
all brands?

a. Carry out a test using $\alpha = .05$.

b. Assess the plausibility of any assumptions required
for your analysis in part (a).

c. Perform a multiple comparisons analysis to identify
significant differences among brands.

28. For a single-factor ANOVA with sample sizes $J_i$ ($i = 1, 2, \ldots, I$),
show that $\text{SSTr} = \sum J_i(\bar{X}_{i.} - \bar{X}_{..})^2 = \sum J_i\bar{X}_{i.}^2 - n\bar{X}_{..}^2$,
where $n = \sum J_i$.


29. When sample sizes are equal ($J_i = J$), the parameters
$\alpha_1, \alpha_2, \ldots, \alpha_I$ of the alternative parameterization are
restricted by $\sum \alpha_i = 0$. For unequal sample sizes, the most
natural restriction is $\sum J_i\alpha_i = 0$. Use this to show that

$$E(\text{MSTr}) = \sigma^2 + \frac{1}{I - 1}\sum J_i\alpha_i^2$$

What is $E(\text{MSTr})$ when $H_0$ is true? [This expectation
is correct if $\sum J_i\alpha_i = 0$ is replaced by the restriction
$\sum \alpha_i = 0$ (or any other single linear restriction on the
$\alpha_i$’s used to reduce the model to $I$ independent param-
eters), but $\sum J_i\alpha_i = 0$ simplifies the algebra and yields
natural estimates for the model parameters (in particular,
$\hat{\alpha}_i = \bar{x}_{i.} - \bar{x}_{..}$).]


30. Reconsider Example 10.8 involving an investigation of
the effects of different heat treatments on the yield point
of steel ingots.

a. If $J = 8$ and $\sigma = 1$, what is $\beta$ for a level .05 F test
when $\mu_1 = \mu_2$, $\mu_3 = \mu_1 - 1$, and $\mu_4 = \mu_1 + 1$?

b. For the alternative of part (a), what value of $J$ is nec-
essary to obtain $\beta = .05$?

c. If there are $I = 5$ heat treatments, $J = 10$, and $\sigma = 1$,
what is $\beta$ for the level .05 F test when four of the $\mu_i$’s
are equal and the fifth differs by 1 from the other four?


31. When sample sizes are not equal, the noncentrality
parameter is $\sum J_i\alpha_i^2/\sigma^2$ and $\phi^2 = (1/I)\sum J_i\alpha_i^2/\sigma^2$. Referring
to Exercise 22, what is the power of the test when
$\mu_2 = \mu_3$, $\mu_1 = \mu_2 - \sigma$, and $\mu_4 = \mu_2 + \sigma$?


32. In an experiment to compare the quality of four different
brands of magnetic recording tape, five 2400-ft reels of
each brand (A-D) were selected and the number of flaws
in each reel was determined.

A:  10   5  12  14   8
B:  14  12  17   9   8
C:  13  18  10  15  18
D:  17  16  12  22  14

It is believed that the number of flaws has approximately
a Poisson distribution for each brand. Analyze the data at
level .01 to see whether the expected number of flaws per
reel is the same for each brand.


33. Suppose that $X_{ij}$ is a binomial variable with parameters $n$
and $p_i$ (so approximately normal when $np_i \ge 10$ and
$nq_i \ge 10$). Then since $\mu_i = np_i$, $V(X_{ij}) = \sigma_i^2 = np_i(1 - p_i) =
\mu_i(1 - \mu_i/n)$. How should the $X_{ij}$’s be transformed so as to
stabilize the variance? [Hint: $g(\mu_i) = \mu_i(1 - \mu_i/n)$.]


34. Simplify $E(\text{MSTr})$ for the random effects model when
$J_1 = J_2 = \cdots = J_I = J$.

SUPPLEMENTARY EXERCISES (35-46)


35. An experiment was carried out to compare flow rates for
four different types of nozzle.

a. Sample sizes were 5, 6, 7, and 6, respectively, and
calculations gave $f = 3.68$. State and test the rele-
vant hypotheses using $\alpha = .01$.

b. Analysis of the data using a statistical computer
package yielded P-value = .029. At level .01, what
would you conclude, and why?


36. Cortisol is a hormone that plays an important role in
mediating stress. There is growing awareness that expo-
sure of outdoor workers to pollutants may impact cortisol
levels. The article “Plasma Cortisol Concentration and
Lifestyle in a Population of Outdoor Workers” (Intl. J.
of Envir. Health Res., 2011: 62-71) reported on a study
involving three groups of police officers: (1) traffic police
(TP), (2) drivers (D), and (3) other duties (O). Here is
summary data on cortisol concentration (ng/ml) for a
subset of the officers who neither drank nor smoked.

Group    Sample Size    Mean     SD
TP           47         174.7    50.9
D            36         160.2    37.2
O            50         153.5    45.9

Assuming that the standard assumptions for one-way
ANOVA are satisfied, carry out a test at significance
level .05 to decide whether true average cortisol concen-
tration is different for the three groups. [Note: The inves-
tigators used more sophisticated statistical methodology
(multiple regression) to assess the impact of age, length
of employment, and drinking and smoking status on
cortisol concentration; taking these factors into account,
concentration appeared to be significantly higher in the
TP group than in the other two groups.]


37. Numerous factors contribute to the smooth running of an
electric motor (“Increasing Market Share Through
Improved Product and Process Design: An
Experimental Approach,” Quality Engineering, 1991:
361-369). In particular, it is desirable to keep motor noise
and vibration to a minimum. To study the effect that the
brand of bearing has on motor vibration, five different
motor bearing brands were examined by installing each
type of bearing on different random samples of six motors.
The amount of motor vibration (measured in microns) was
recorded when each of the 30 motors was running. The
data for this study follows. State and test the relevant
hypotheses at significance level .05, and then carry out a
multiple comparisons analysis if appropriate.

                                          Mean
1:  13.1  15.0  14.0  14.4  14.0  11.6   13.68
2:  16.3  15.7  17.2  14.9  14.4  17.2   15.95
3:  13.7  13.9  12.4  13.8  14.9  13.3   13.67
4:  15.7  13.7  14.4  16.0  13.9  14.7   14.73
5:  13.5  13.4  13.2  12.7  13.4  12.3   13.08


38. An article in the British scientific journal Nature
(“Sucrose Induction of Hepatic Hyperplasia in the
Rat,” August 25, 1972: 461) reports on an experiment in
which each of five groups consisting of six rats was put on
a diet with a different carbohydrate. At the conclusion of
the experiment, the DNA content of the liver of each rat
was determined (mg/g liver), with the following results:

Carbohydrate    x̄_i.
Starch          2.58
Sucrose         2.63
Fructose        2.13
Glucose         2.41
Maltose         2.49

Assuming also that $\sum\sum x_{ij}^2 = 183.4$, does the data indicate
that true average DNA content is affected by the type of
carbohydrate in the diet? Construct an ANOVA table and
use a .05 level of significance.


39. Referring to Exercise 38, construct a t CI for

$$\theta = \mu_1 - (\mu_2 + \mu_3 + \mu_4 + \mu_5)/4$$

which measures the difference between the average DNA
content for the starch diet and the combined average for the
four other diets. Does the resulting interval include zero?


40. Refer to Exercise 38. What is $\beta$ for the test when true
average DNA content is identical for three of the diets
and falls below this common value by 1 standard devia-
tion ($\sigma$) for the other two diets?
41. Four laboratories (1-4) are randomly selected from a
large population, and each is asked to make three deter- 
minations of the percentage of methyl alcohol in speci- 
mens of a compound taken from a single batch. Based 
on the accompanying data, are differences among labo- 
ratories a source of variation in the percentage of methyl 
alcohol? State and test the relevant hypotheses using 
significance level .05. 


1: 85.06 85.25 84.87 
2: 84.99 84.28 84.88 
3: 84.48 84.72 85.10 
4: 84.10 84.55 84.05 


42. The critical flicker frequency (cff) is the highest fre- 
quency (in cycles/sec) at which a person can detect the 
flicker in a flickering light source. At frequencies above 
the cff, the light source appears to be continuous even 
though it is actually flickering. An investigation carried 
out to see whether true average cff depends on iris color 
yielded the following data (based on the article “The
Effects of Iris Color on Critical Flicker Frequency,” J.
of General Psych., 1973: 91-95):


              Iris Color
        1. Brown    2. Green    3. Blue
          26.8        26.4        25.7
          27.9        24.2        27.2
          23.7        28.0        29.9
          25.0        26.9        28.5
          26.3        29.1        29.4
          24.8                    28.3
          25.7
          24.5
J_i          8           5           6
x_i.     204.7       134.6       169.0
x̄_i.     25.59       26.92       28.17

n = 19    x.. = 508.3


a. State and test the relevant hypotheses at sig-
nificance level .05 [Hint: $\sum\sum x_{ij}^2 = 13{,}659.67$ and
CF = 13,598.36.]

b. Investigate differences between iris colors with respect 
to mean cff. 
43. Let $c_1, c_2, \ldots, c_I$ be numbers satisfying $\sum c_i = 0$. Then
$\sum c_i\mu_i = c_1\mu_1 + \cdots + c_I\mu_I$ is called a contrast in the $\mu_i$’s.
Notice that with $c_1 = 1$, $c_2 = -1$, $c_3 = \cdots = c_I = 0$,
$\sum c_i\mu_i = \mu_1 - \mu_2$, which implies that every pairwise differ-
ence between $\mu_i$’s is a contrast (so is, e.g., $\mu_1 - .5\mu_2 - .5\mu_3$).
A method attributed to Scheffé gives simul-
taneous CI’s with simultaneous confidence level
$100(1 - \alpha)\%$ for all possible contrasts (an infinite number
of them!). The interval for $\sum c_i\mu_i$ is

$$\sum c_i\bar{x}_{i.} \pm \left(\sum c_i^2/J_i\right)^{1/2} \cdot \left[(I - 1) \cdot \text{MSE} \cdot F_{\alpha,I-1,n-I}\right]^{1/2}$$

Using the critical flicker frequency data of Exercise 42,
calculate the Scheffé intervals for the contrasts
$\mu_1 - \mu_2$, $\mu_1 - \mu_3$, $\mu_2 - \mu_3$, and $.5\mu_1 + .5\mu_2 - \mu_3$ (this
last contrast compares blue to the average of brown and
green). Which contrasts appear to differ significantly
from 0, and why?


44. Four types of mortars—ordinary cement mortar (OCM), 
polymer impregnated mortar (PIM), resin mortar (RM), 
and polymer cement mortar (PCM)—were subjected to a 
compression test to measure strength (MPa). Three 
strength observations for each mortar type are given in the 
article “Polymer Mortar Composite Matrices for 
Maintenance-Free Highly Durable Ferrocement” (J. of 
Ferrocement, 1984: 337-345) and are reproduced here. 
Construct an ANOVA table. Using a .05 significance level, 
determine whether the data suggests that the true mean 
strength is not the same for all four mortar types. If you 
determine that the true mean strengths are not all equal, 
use Tukey’s method to identify the significant differences. 


OCM     32.15    35.53    34.20
PIM    126.32   126.80   134.79
RM     117.91   115.02   114.58
PCM     29.09    30.87    29.80


45. Suppose the $x_{ij}$’s are “coded” by $y_{ij} = cx_{ij} + d$. How
does the value of the F statistic computed from the $y_{ij}$’s
compare to the value computed from the $x_{ij}$’s? Justify
your assertion.


46. In Example 10.11, subtract $\bar{x}_{i.}$ from each observation in
the $i$th sample ($i = 1, \ldots, 6$) to obtain a set of 18 residu-
als. Then construct a normal probability plot and com-
ment on the plausibility of the normality assumption.


contains a very well-presented survey of ANOVA; the 
level is comparable to that of the present text, but the 
discussion is more comprehensive, making the book an 
excellent reference. 

Ott, R. Lyman and Michael Longnecker. An Introduction to 
Statistical Methods and Data Analysis (6th ed.), Cengage 
Learning, Boston, 2010. Includes several chapters on 
ANOVA methodology that can profitably be read by stu- 
dents desiring a very nonmathematical exposition; there is 
a good chapter on various multiple comparison methods. 


Multifactor Analysis of Variance


INTRODUCTION


In the previous chapter, we used the analysis of variance (ANOVA) to test for
equality of either $I$ different population means or the true average responses
associated with $I$ different levels of a single factor (alternatively referred to
as $I$ different treatments). In many experimental situations, there are two or
more factors that are of simultaneous interest. This chapter extends the meth-
ods of Chapter 10 to investigate such multifactor situations.

In the first two sections, we concentrate on the case of two factors. We
will use $I$ to denote the number of levels of the first factor (A) and $J$ to denote
the number of levels of the second factor (B). Then there are $IJ$ possible combi-
nations consisting of one level of factor A and one of factor B. Each such com-
bination is called a treatment, so there are $IJ$ different treatments. The number
of observations made on treatment $(i, j)$ will be denoted by $K_{ij}$. In Section 11.1,
we consider $K_{ij} = 1$. An important special case of this type is a randomized
block design, in which a single factor A is of primary interest but another factor,
“blocks,” is created to control for extraneous variability in experimental units
or subjects. Section 11.2 focuses on the case $K_{ij} = K > 1$, with brief mention of
the difficulties associated with unequal $K_{ij}$’s.

Section 11.3 considers experiments involving more than two factors.
When the number of factors is large, an experiment consisting of at least one
observation for each treatment would be expensive and time consuming. One
frequently encountered situation, which we discuss in Section 11.4, is that in
which there are $p$ factors, each of which has two levels. There are then $2^p$ dif-
ferent treatments. We consider both the case in which observations are made
on all these treatments (a complete design) and the case in which observations
are made for only a selected subset of treatments (an incomplete design).
11.1 Two-Factor ANOVA with $K_{ij} = 1$


When factor A consists of $I$ levels and factor B consists of $J$ levels, there are $IJ$
different combinations (pairs) of levels of the two factors, each called a treatment.
With $K_{ij}$ = the number of observations on the treatment consisting of factor A at level
$i$ and factor B at level $j$, we restrict attention in this section to the case $K_{ij} = 1$, so
that the data consists of $IJ$ observations. Our focus is on the fixed effects model, in
which the only levels of interest for the two factors are those actually represented in
the experiment. Situations in which at least one factor is random are discussed briefly
at the end of the section.


EXAMPLE 11.1 Is it really as easy to remove marks on fabrics from erasable pens as the word 
erasable might imply? Consider the following data from an experiment to com- 
pare three different brands of pens and four different wash treatments with respect 
to their ability to remove marks on a particular type of fabric (based on “An 
Assessment of the Effects of Treatment, Time, and Heat on the Removal of 
Erasable Pen Marks from Cotton and Cotton/Polyester Blend Fabrics,” J. of 
Testing and Evaluation, 1991: 394-397). The response variable is a quantitative 
indicator of overall specimen color change; the lower this value, the more marks 
were removed. 


                         Washing Treatment

                      1       2       3       4      Total    Average
                1    .97     .48     .48     .46     2.39      .598
Brand of Pen    2    .77     .14     .22     .25     1.38      .345
                3    .67     .39     .57     .19     1.82      .455
            Total   2.41    1.01    1.27     .90     5.59
          Average   .803    .337    .423    .300               .466


Is there any difference in the true average amount of color change due either to the
different brands of pens or to the different washing treatments?


As in single-factor ANOVA, double subscripts are used to identify random
variables and observed values. Let


$X_{ij}$ = the random variable (rv) denoting the measurement when factor A is
held at level $i$ and factor B is held at level $j$


$x_{ij}$ = the observed value of $X_{ij}$


The $x_{ij}$’s are usually presented in a rectangular table in which the various rows are
identified with the levels of factor A and the various columns with the levels of
factor B. In the erasable-pen experiment of Example 11.1, the number of levels
of factor A is $I = 3$, the number of levels of factor B is $J = 4$, $x_{13} = .48$, $x_{22} = .14$,
and so on.
Whereas in single-factor ANOVA we were interested only in row means and
the grand mean, now we are interested also in column means. Let

$$\bar{X}_{i.} = \text{the average of measurements obtained when factor A is held at level } i = \frac{\sum_{j=1}^{J} X_{ij}}{J}$$

$$\bar{X}_{.j} = \text{the average of measurements obtained when factor B is held at level } j = \frac{\sum_{i=1}^{I} X_{ij}}{I}$$

$$\bar{X}_{..} = \text{the grand mean} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij}}{IJ}$$

with observed values $\bar{x}_{i.}$, $\bar{x}_{.j}$, and $\bar{x}_{..}$. Totals rather than averages are denoted by
omitting the horizontal bar (so $x_{.j} = \sum_i x_{ij}$, etc.). Intuitively, to see whether there
is any effect due to the levels of factor A, we should compare the observed $\bar{x}_{i.}$’s
with one another. Information about the different levels of factor B should come
from the $\bar{x}_{.j}$’s.
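
For the erasable-pen data of Example 11.1, these averages can be computed directly from the $3 \times 4$ table. The sketch below is only an illustration; the entry for brand 3 under washing treatment 3 is taken to be .57, the value consistent with the printed row and column totals, and the variable names are hypothetical.

```python
# Rows: brand of pen (factor A); columns: washing treatment (factor B).
# Assumption: the (brand 3, treatment 3) entry is .57, as implied by the totals.
x = [
    [0.97, 0.48, 0.48, 0.46],
    [0.77, 0.14, 0.22, 0.25],
    [0.67, 0.39, 0.57, 0.19],
]
I, J = len(x), len(x[0])

row_means = [sum(row) / J for row in x]                               # x-bar_i.
col_means = [sum(x[i][j] for i in range(I)) / I for j in range(J)]    # x-bar_.j
grand_mean = sum(sum(row) for row in x) / (I * J)                     # x-bar_..

print("row means:   ", [f"{m:.3f}" for m in row_means])
print("column means:", [f"{m:.3f}" for m in col_means])
print(f"grand mean:   {grand_mean:.3f}")
```

The row means summarize the brands and the column means summarize the washing treatments; comparing each set among themselves is the informal version of the tests developed in this section.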


The Fixed Effects Model 


Proceeding by analogy to single-factor ANOVA, one’s first inclination in specifying
a model is to let $\mu_{ij}$ = the true average response when factor A is at level i and factor
B at level j. This results in IJ mean parameters. Then let


$$X_{ij} = \mu_{ij} + \epsilon_{ij}$$


where $\epsilon_{ij}$ is the random amount by which the observed value differs from its
expectation. The $\epsilon_{ij}$’s are assumed normal and independent with common variance
$\sigma^2$. Unfortunately, there is no valid test procedure for this choice of parameters.
This is because there are IJ + 1 parameters (the $\mu_{ij}$’s and $\sigma^2$) but only IJ
observations, so after using each $x_{ij}$ as an estimate of $\mu_{ij}$ there is no way to estimate $\sigma^2$.
The following alternative model is realistic yet involves relatively few
parameters.


Assume the existence of I parameters $\alpha_1, \alpha_2, \ldots, \alpha_I$ and J parameters
$\beta_1, \beta_2, \ldots, \beta_J$ such that


$$X_{ij} = \alpha_i + \beta_j + \epsilon_{ij} \tag{11.1}$$

so that

$$\mu_{ij} = \alpha_i + \beta_j \tag{11.2}$$


Including $\sigma^2$, there are now I + J + 1 model parameters, so if I ≥ 3 and J ≥ 3,
then there will be fewer parameters than observations (in fact, we will shortly
modify (11.2) so that even I = 2 and/or J = 2 will be accommodated).

The model specified in (11.1) and (11.2) is called an additive model
because each mean response $\mu_{ij}$ is the sum of an effect due to factor A at level
i ($\alpha_i$) and an effect due to factor B at level j ($\beta_j$). The difference between mean
responses for factor A at level i and level i′ when B is held at level j is $\mu_{ij} - \mu_{i'j}$.
When the model is additive,


$$\mu_{ij} - \mu_{i'j} = (\alpha_i + \beta_j) - (\alpha_{i'} + \beta_j) = \alpha_i - \alpha_{i'}$$


which is independent of the level j of the second factor. A similar result holds for
$\mu_{ij} - \mu_{ij'}$. Thus additivity means that the difference in mean responses for two
levels of one of the factors is the same for all levels of the other factor. Figure 11.1(a)
shows a set of mean responses that satisfy the condition of additivity. A nonadditive
configuration is illustrated in Figure 11.1(b).


Figure 11.1 Mean responses for two types of model: (a) additive; (b) nonadditive
(vertical axis: mean response; horizontal axis: levels of A; one line for each level of B)


EXAMPLE 11.2 (Example 11.1 continued) Plotting the observed $x_{ij}$’s in a manner
analogous to that of Figure 11.1 results in Figure 11.2. Although there is some
“crossing over” in the observed $x_{ij}$’s, the pattern is reasonably representative of
what would be expected under additivity with just one observation per treatment.


Figure 11.2 Plot of data from Example 11.1
(vertical axis: color change; horizontal axis: washing treatment)

Expression (11.2) is not quite the final model description because the $\alpha_i$’s and
$\beta_j$’s are not uniquely determined. Here are two different configurations of the $\alpha_i$’s
and $\beta_j$’s that yield the same additive $\mu_{ij}$’s:


                 $\beta_1 = 1$      $\beta_2 = 4$                        $\beta_1 = 2$      $\beta_2 = 5$
$\alpha_1 = 1$   $\mu_{11} = 2$   $\mu_{12} = 5$      $\alpha_1 = 0$   $\mu_{11} = 2$   $\mu_{12} = 5$
$\alpha_2 = 2$   $\mu_{21} = 3$   $\mu_{22} = 6$      $\alpha_2 = 1$   $\mu_{21} = 3$   $\mu_{22} = 6$


By subtracting any constant c from all $\alpha_i$’s and adding c to all $\beta_j$’s, other
configurations corresponding to the same additive model are obtained. This nonuniqueness is
eliminated by use of the following model.


$$X_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij} \tag{11.3}$$

where $\sum_{i=1}^{I} \alpha_i = 0$, $\sum_{j=1}^{J} \beta_j = 0$, and the $\epsilon_{ij}$’s are assumed independent, normally
distributed, with mean 0 and common variance $\sigma^2$.


This is analogous to the alternative choice of parameters for single-factor ANOVA
discussed in Section 10.3. It is not difficult to verify that (11.3) is an additive
model in which the parameters are uniquely determined (for example, for the
$\mu_{ij}$’s mentioned previously: $\mu = 4$, $\alpha_1 = -.5$, $\alpha_2 = .5$, $\beta_1 = -1.5$, and $\beta_2 = 1.5$).
Notice that there are only I − 1 independently determined $\alpha_i$’s and J − 1
independently determined $\beta_j$’s. Including $\mu$, (11.3) specifies I + J − 1 mean
parameters.
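The uniqueness claim can be checked numerically. The following sketch (the function name `sum_to_zero` is an illustrative assumption) takes an additive table of mean responses and recovers the parameters of (11.3) by averaging: the grand average gives $\mu$, and row and column averages measured from $\mu$ give the $\alpha_i$’s and $\beta_j$’s. It is applied to the 2 × 2 table of $\mu_{ij}$’s shown earlier.

```python
def sum_to_zero(mu_table):
    """Return (mu, alphas, betas) with sum(alphas) = sum(betas) = 0.

    Assumes mu_table is additive, i.e. mu_table[i][j] = a_i + b_j
    for some (not necessarily unique) a_i's and b_j's.
    """
    I, J = len(mu_table), len(mu_table[0])
    mu = sum(sum(row) for row in mu_table) / (I * J)
    alphas = [sum(row) / J - mu for row in mu_table]
    betas = [sum(mu_table[i][j] for i in range(I)) / I - mu for j in range(J)]
    return mu, alphas, betas


# The additive mean responses produced by both configurations in the text.
mu_table = [[2, 5],
            [3, 6]]

mu, alphas, betas = sum_to_zero(mu_table)
print(mu, alphas, betas)

# Rebuilding mu + alpha_i + beta_j reproduces the original table.
rebuilt = [[mu + a + b for b in betas] for a in alphas]
print(rebuilt)
```

Either of the two original configurations leads to the same `mu`, `alphas`, and `betas`, because the function depends only on the $\mu_{ij}$’s themselves.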

The interpretation of the parameters in (11.3) is straightforward: $\mu$ is the true
grand mean (mean response averaged over all levels of both factors), $\alpha_i$ is the effect
of factor A at level i (measured as a deviation from $\mu$), and $\beta_j$ is the effect of factor
B at level j. Unbiased (and maximum likelihood) estimators for these parameters are


$$\hat\mu = \bar X_{\cdot\cdot} \qquad \hat\alpha_i = \bar X_{i\cdot} - \bar X_{\cdot\cdot} \qquad \hat\beta_j = \bar X_{\cdot j} - \bar X_{\cdot\cdot}$$


There are two different null hypotheses of interest in a two-factor experiment with
$K_{ij} = 1$. The first, denoted by $H_{0A}$, states that the different levels of factor A have no
effect on true average response. The second, denoted by $H_{0B}$, asserts that there is no
factor B effect.


$$\begin{aligned}
H_{0A}&: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0 \quad \text{versus} \quad H_{aA}: \text{at least one } \alpha_i \neq 0 \\
H_{0B}&: \beta_1 = \beta_2 = \cdots = \beta_J = 0 \quad \text{versus} \quad H_{aB}: \text{at least one } \beta_j \neq 0
\end{aligned} \tag{11.4}$$


(No factor A effect implies that all $\alpha_i$’s are equal, so they must all be 0 since they
sum to 0, and similarly for the $\beta_j$’s.)


Test Procedures 


The description and analysis follow closely that for single-factor ANOVA. There are
now four sums of squares, each with an associated number of df:

DEFINITION

$$\begin{aligned}
\text{SST} &= \sum_{i=1}^{I}\sum_{j=1}^{J} (X_{ij} - \bar X_{\cdot\cdot})^2 & \text{df} &= IJ - 1 \\
\text{SSA} &= \sum_{i=1}^{I}\sum_{j=1}^{J} (\bar X_{i\cdot} - \bar X_{\cdot\cdot})^2 = J\sum_{i=1}^{I} (\bar X_{i\cdot} - \bar X_{\cdot\cdot})^2 & \text{df} &= I - 1 \\
\text{SSB} &= \sum_{i=1}^{I}\sum_{j=1}^{J} (\bar X_{\cdot j} - \bar X_{\cdot\cdot})^2 = I\sum_{j=1}^{J} (\bar X_{\cdot j} - \bar X_{\cdot\cdot})^2 & \text{df} &= J - 1 \\
\text{SSE} &= \sum_{i=1}^{I}\sum_{j=1}^{J} (X_{ij} - \bar X_{i\cdot} - \bar X_{\cdot j} + \bar X_{\cdot\cdot})^2 & \text{df} &= (I - 1)(J - 1)
\end{aligned} \tag{11.5}$$

The fundamental identity is

$$\text{SST} = \text{SSA} + \text{SSB} + \text{SSE} \tag{11.6}$$


There are computing formulas for SST, SSA, and SSB analogous to those given in
Chapter 10 for single-factor ANOVA. But the wide availability of statistical software has
rendered these formulas almost obsolete.

The expression for SSE results from replacing $\mu$, $\alpha_i$, and $\beta_j$ by their
estimators in $\sum\sum [X_{ij} - (\mu + \alpha_i + \beta_j)]^2$. Error df is IJ − number of mean parameters
estimated = IJ − [1 + (I − 1) + (J − 1)] = (I − 1)(J − 1). Total variation is split
into a part (SSE) that is not explained by either the truth or the falsity of $H_{0A}$ or $H_{0B}$
and two parts that can be explained by possible falsity of the two null hypotheses.

Statistical theory now says that if we form F ratios as in single-factor ANOVA,
when $H_{0A}$ ($H_{0B}$) is true, the corresponding F ratio has an F distribution with
numerator df = I − 1 (J − 1) and denominator df = (I − 1)(J − 1).


Hypotheses                   Test Statistic Value             P-Value Determination

$H_{0A}$ versus $H_{aA}$     $f_A = \text{MSA}/\text{MSE}$    Area under the $F_{I-1,(I-1)(J-1)}$
                                                              curve to the right of $f_A$

$H_{0B}$ versus $H_{aB}$     $f_B = \text{MSB}/\text{MSE}$    Area under the $F_{J-1,(I-1)(J-1)}$
                                                              curve to the right of $f_B$


EXAMPLE 11.3 (Example 11.2 continued) The $\bar x_{i\cdot}$’s and $\bar x_{\cdot j}$’s for the color-change
data are displayed along the margins of the data table given previously. Table 11.1
summarizes the calculations.


Table 11.1 ANOVA Table for Example 11.3


Source of Variation          df                    Sum of Squares   Mean Square     f
Factor A (brand)             I − 1 = 2             SSA = .1282      MSA = .0641     f_A = 4.43
Factor B (wash treatment)    J − 1 = 3             SSB = .4797      MSB = .1599     f_B = 11.05
Error                        (I − 1)(J − 1) = 6    SSE = .0868      MSE = .01447
Total                        IJ − 1 = 11           SST = .6947

Because $F_{.10,2,6} = 3.46 < 4.43 < 5.14 = F_{.05,2,6}$, the P-value for testing $H_{0A}$ is
between .05 and .10. Thus $H_{0A}$ cannot be rejected at significance level .05. True
average color change does not appear to depend on the brand of pen. Since $F_{.01,3,6}
= 9.78$ and $F_{.001,3,6} = 23.70$, .001 < P-value < .01 for testing $H_{0B}$. Therefore this
null hypothesis is rejected at significance level .05 in favor of the assertion that
color change varies with washing treatment. A statistical computer package gives
P-values of .066 and .007 for these two tests.
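The entries of the ANOVA table can be reproduced straight from the defining sums of squares in (11.5). The sketch below is one possible way to organize the arithmetic, not the method of any particular software package; it takes the twelve observations of Example 11.1 as .97, .48, ..., .19, and all variable names are illustrative assumptions.

```python
# Color-change data: rows = brands of pen (A), columns = washing treatments (B).
x = [
    [0.97, 0.48, 0.48, 0.46],
    [0.77, 0.14, 0.22, 0.25],
    [0.67, 0.39, 0.57, 0.19],
]
I, J = len(x), len(x[0])

row_means = [sum(row) / J for row in x]
col_means = [sum(x[i][j] for i in range(I)) / I for j in range(J)]
grand = sum(sum(row) for row in x) / (I * J)

# Sums of squares, written exactly as in the definitions.
sst = sum((x[i][j] - grand) ** 2 for i in range(I) for j in range(J))
ssa = J * sum((m - grand) ** 2 for m in row_means)
ssb = I * sum((m - grand) ** 2 for m in col_means)
sse = sum(
    (x[i][j] - row_means[i] - col_means[j] + grand) ** 2
    for i in range(I)
    for j in range(J)
)

# Degrees of freedom, mean squares, and the two F ratios.
df_a, df_b, df_e = I - 1, J - 1, (I - 1) * (J - 1)
msa, msb, mse = ssa / df_a, ssb / df_b, sse / df_e
f_a, f_b = msa / mse, msb / mse

print(f"SSA = {ssa:.4f}  SSB = {ssb:.4f}  SSE = {sse:.4f}  SST = {sst:.4f}")
print(f"MSA = {msa:.4f}  MSB = {msb:.4f}  MSE = {mse:.5f}")
print(f"f_A = {f_a:.2f}  f_B = {f_b:.2f}")
```

Because SSE is computed from its own definition rather than by subtraction, comparing `sst` with `ssa + ssb + sse` also serves as a numerical check of the fundamental identity (11.6).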


Plausibility of the normality and constant variance assumptions can be investigated
graphically. Define predicted values (also called fitted values)

$$\hat x_{ij} = \hat\mu + \hat\alpha_i + \hat\beta_j = \bar x_{\cdot\cdot} + (\bar x_{i\cdot} - \bar x_{\cdot\cdot}) + (\bar x_{\cdot j} - \bar x_{\cdot\cdot}) = \bar x_{i\cdot} + \bar x_{\cdot j} - \bar x_{\cdot\cdot}$$

and the residuals (the differences between the observations and predicted values)
$x_{ij} - \hat x_{ij} = x_{ij} - \bar x_{i\cdot} - \bar x_{\cdot j} + \bar x_{\cdot\cdot}$. We
can check the normality assumption with a normal probability plot of the residuals,
and the constant variance assumption with a plot of the residuals against the fitted
values. Figure 11.3 shows these plots for the data of Example 11.3.


Figure 11.3 Diagnostic plots from Minitab for Example 11.3: (a) normal probability plot
of the residuals; (b) residuals versus the fitted values


The normal probability plot is reasonably straight, so there is no reason to 
question normality for this data set. On the plot of the residuals against the fitted val- 
ues, look for substantial variation in vertical spread when moving from left to right. 
For example, a narrow range for small fitted values and a wide range for high fitted 
values would suggest that the variance is higher for larger responses (this happens 
often, and it can sometimes be cured by replacing each observation by its logarithm). 
Figure 11.3(b) shows no evidence against the constant variance assumption. 
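The fitted values and residuals that feed such plots are simple to compute. The following sketch assumes the color-change observations of Example 11.1 are .97, .48, ..., .19 and uses illustrative variable names; it produces the (fitted value, residual) pairs one would hand to any plotting tool, and does not attempt to reproduce the Minitab graphs themselves.

```python
# Color-change data: rows = brands of pen (A), columns = washing treatments (B).
x = [
    [0.97, 0.48, 0.48, 0.46],
    [0.77, 0.14, 0.22, 0.25],
    [0.67, 0.39, 0.57, 0.19],
]
I, J = len(x), len(x[0])

row_means = [sum(row) / J for row in x]
col_means = [sum(x[i][j] for i in range(I)) / I for j in range(J)]
grand = sum(sum(row) for row in x) / (I * J)

# Fitted value for cell (i, j): row mean + column mean - grand mean.
fitted = [[row_means[i] + col_means[j] - grand for j in range(J)] for i in range(I)]

# Residual for cell (i, j): observation minus fitted value.
residuals = [[x[i][j] - fitted[i][j] for j in range(J)] for i in range(I)]

# (fitted, residual) pairs, e.g. for a residuals-versus-fitted plot.
pairs = [(fitted[i][j], residuals[i][j]) for i in range(I) for j in range(J)]
for fit, res in sorted(pairs):
    print(f"fitted = {fit:.3f}   residual = {res:+.3f}")
```

Two quick sanity checks are available: the residuals in every row and every column sum to zero, and the sum of their squares equals SSE.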


Expected Mean Squares 


The plausibility of using the F tests just described is demonstrated by computing the
expected mean squares. For the additive model,


$$E(\text{MSE}) = \sigma^2$$

$$E(\text{MSA}) = \sigma^2 + \frac{J}{I - 1}\sum_{i=1}^{I} \alpha_i^2$$

$$E(\text{MSB}) = \sigma^2 + \frac{I}{J - 1}\sum_{j=1}^{J} \beta_j^2$$


If $H_{0A}$ is true, MSA is an unbiased estimator of $\sigma^2$, in which case $F_A$ is a ratio of two
unbiased estimators of $\sigma^2$. When $H_{0A}$ is false, MSA tends to overestimate $\sigma^2$. Thus
the larger the value of $F_A$, the more contradictory is the data to $H_{0A}$. This explains
why the test is upper-tailed. Similar comments apply to MSB and $H_{0B}$.
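A small simulation can make the expected mean squares concrete. The sketch below generates many data sets from the additive model with normal errors and averages the resulting mean squares. Every numerical setting in it (the effects, $\mu = 10$, $\sigma = 1$, the number of replications, and the seed) is a hypothetical choice made only for illustration, and the helper names are assumptions as well.

```python
import random


def mean_squares(x):
    """Return (MSA, MSB, MSE) for an I x J table with one observation per cell."""
    I, J = len(x), len(x[0])
    row_means = [sum(row) / J for row in x]
    col_means = [sum(x[i][j] for i in range(I)) / I for j in range(J)]
    grand = sum(sum(row) for row in x) / (I * J)
    ssa = J * sum((m - grand) ** 2 for m in row_means)
    ssb = I * sum((m - grand) ** 2 for m in col_means)
    sse = sum(
        (x[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(I)
        for j in range(J)
    )
    return ssa / (I - 1), ssb / (J - 1), sse / ((I - 1) * (J - 1))


def simulate(alphas, betas, mu, sigma, reps, seed):
    """Average MSA, MSB, MSE over `reps` simulated data sets."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(reps):
        x = [[mu + a + b + rng.gauss(0.0, sigma) for b in betas] for a in alphas]
        for k, ms in enumerate(mean_squares(x)):
            totals[k] += ms
    return [t / reps for t in totals]


# Hypothetical settings: a real factor A effect, no factor B effect.
alphas = [-1.0, 0.0, 1.0]      # sums to 0, so H0A is false
betas = [0.0, 0.0, 0.0, 0.0]   # all 0, so H0B is true
sigma = 1.0
I, J = len(alphas), len(betas)

avg_msa, avg_msb, avg_mse = simulate(alphas, betas, mu=10.0, sigma=sigma,
                                     reps=4000, seed=1)

expected_msa = sigma ** 2 + J / (I - 1) * sum(a * a for a in alphas)
expected_msb = sigma ** 2 + I / (J - 1) * sum(b * b for b in betas)
print(f"MSA: simulated {avg_msa:.3f}, expected {expected_msa:.3f}")
print(f"MSB: simulated {avg_msb:.3f}, expected {expected_msb:.3f}")
print(f"MSE: simulated {avg_mse:.3f}, expected {sigma ** 2:.3f}")
```

With these settings MSB and MSE both average close to $\sigma^2$, while MSA averages well above it, which is the behavior that makes an upper-tailed test sensible.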


Multiple Comparisons 


After rejecting either $H_{0A}$ or $H_{0B}$, Tukey’s procedure can be used to identify
significant differences between the levels of the factor under investigation.

1. For comparing levels of factor A, obtain $Q_{\alpha,I,(I-1)(J-1)}$.
   For comparing levels of factor B, obtain $Q_{\alpha,J,(I-1)(J-1)}$.


2. Compute

   w = Q · (estimated standard deviation of the sample means being compared)

   $$w = \begin{cases}
   Q_{\alpha,I,(I-1)(J-1)} \cdot \sqrt{\text{MSE}/J} & \text{for factor A comparisons} \\
   Q_{\alpha,J,(I-1)(J-1)} \cdot \sqrt{\text{MSE}/I} & \text{for factor B comparisons}
   \end{cases}$$

   (because, e.g., the standard deviation of $\bar X_{i\cdot}$ is $\sigma/\sqrt{J}$).


3. Arrange the sample means in increasing order, underscore those pairs differing 
by less than w, and identify pairs not underscored by the same line as correspond- 
ing to significantly different levels of the given factor. 


EXAMPLE 11.4 (Example 11.3 continued) Identification of significant differences among
the four washing treatments requires $Q_{.05,4,6} = 4.90$ and $w = 4.90\sqrt{(.01447)/3} = .340$.
The four factor B sample means (column averages) are now listed in increasing order,
and any pair differing by less than .340 is underscored by a line segment:


    $\bar x_{\cdot 4}$    $\bar x_{\cdot 2}$    $\bar x_{\cdot 3}$    $\bar x_{\cdot 1}$
    .300      .337      .423      .803
    ________________________


Washing treatment 1 appears to differ significantly from the other three treatments,
but no other significant differences are identified. In particular, it is not apparent
which among treatments 2, 3, and 4 is best at removing marks.
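The comparison step of Tukey’s procedure is easy to mechanize once the critical value has been looked up. In the sketch below, the value 4.90 is the studentized range critical value quoted in Example 11.4 (it is supplied by hand, not computed), the column averages are the rounded values from the data table, and the function and variable names are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def tukey_w(q_crit, mse, n_per_mean):
    """Tukey yardstick: w = Q * sqrt(MSE / number of observations per mean)."""
    return q_crit * sqrt(mse / n_per_mean)


# Column averages for the four washing treatments (factor B).
col_means = {1: 0.803, 2: 0.337, 3: 0.423, 4: 0.300}

# Each column average is based on I = 3 observations (one per brand).
w = tukey_w(q_crit=4.90, mse=0.01447, n_per_mean=3)

# Pairs of treatments whose sample means differ by more than w.
significant = [
    (a, b)
    for a, b in combinations(sorted(col_means), 2)
    if abs(col_means[a] - col_means[b]) > w
]

print(f"w = {w:.3f}")
print("significantly different pairs:", significant)
```

Only pairs involving treatment 1 are flagged, in agreement with the underscoring pattern.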


Randomized Block Experiments 


In using single-factor ANOVA to test for the presence of effects due to the I
different treatments under study, once the IJ subjects or experimental units have been
chosen, treatments should be allocated in a completely random fashion. That is,
J subjects should be chosen at random for the first treatment, then another sample
of J chosen at random from the remaining IJ − J subjects for the second treatment,
and so on.

It frequently happens, though, that subjects or experimental units exhibit
heterogeneity with respect to other characteristics that may affect the observed responses.
Then, the presence or absence of a significant F value may be due to this extraneous
variation rather than to the presence or absence of factor effects. This is why paired
experiments were introduced in Chapter 9. The analogy to a paired experiment when
I > 2 is called a randomized block experiment. An extraneous factor, “blocks,” is
constructed by dividing the IJ units into J groups with I units in each group. This
grouping or blocking should be done so that within each block, the I units are
homogeneous with respect to other factors thought to affect the responses. Then within each
homogeneous block, the I treatments are randomly assigned to the I units or subjects.
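The randomization step within blocks amounts to drawing an independent random ordering of the treatments for every block. The sketch below shows one way to do this; the function name, the treatment labels, the number of blocks, and the seed are all hypothetical choices made for illustration.

```python
import random


def randomized_block_layout(treatments, n_blocks, seed=None):
    """Return one random ordering of all treatments for each block.

    Position k in a block's ordering is the treatment assigned to
    unit k of that block, so every treatment appears once per block.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)   # copy so each block is shuffled separately
        rng.shuffle(order)
        layout.append(order)
    return layout


# Hypothetical example: I = 5 treatments, J = 4 blocks.
treatments = ["T1", "T2", "T3", "T4", "T5"]
layout = randomized_block_layout(treatments, n_blocks=4, seed=2024)

for block, order in enumerate(layout, start=1):
    print(f"block {block}: units 1-5 receive {order}")
```

Passing a seed makes the assignment reproducible for record keeping; omitting it gives a fresh randomization each time.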


EXAMPLE 11.5 A consumer product-testing organization wished to compare the annual power con- 
sumption for five different brands of dehumidifier. Because power consumption 
depends on the prevailing humidity level, it was decided to monitor each brand at 
four different levels ranging from moderate to heavy humidity (thus blocking on 
humidity level). Within each level, brands were randomly assigned to the five 
selected locations. The resulting observations (annual kWh) appear in Table 11.2, 
and the ANOVA calculations are summarized in Table 11.3. 


Table 11.2 Power Consumption Data for Example 11.5


Treatments              Blocks (humidity level)
(brands)            1         2         3         4        $x_{i\cdot}$    $\bar x_{i\cdot}$
1                 685       792       838       875        3190     797.50
2                 722       806       893       953        3374     843.50
3                 733       802       880       941        3356     839.00
4                 811       888       952      1005        3656     914.00
5                 828       920       978      1023        3749     937.25
$x_{\cdot j}$            3779      4208      4541      4797      17,325
$\bar x_{\cdot j}$          755.80    841.60    908.20    959.40                866.25


Table 11.3 ANOVA Table for Example 11.5

Source of Variation      df    Sum of Squares    Mean Square     f
Treatments (brands)       4       53,231.00       13,307.75      f_A = 95.57
Blocks                    3      116,217.75       38,739.25      f_B = 278.20
Error                    12        1671.00          139.25
Total                    19      171,119.75

The F ratio for treatments considerably exceeds $F_{.001,4,12} = 9.63$, so P-value <
.001. Therefore at significance level .05, $H_0$ is rejected in favor of $H_a$. Power
consumption appears to depend on the brand of dehumidifier. To identify significantly different
brands, we use Tukey’s procedure. $Q_{.05,5,12} = 4.51$ and $w = 4.51\sqrt{139.25/4} = 26.6$.


    $\bar x_{1\cdot}$       $\bar x_{3\cdot}$       $\bar x_{2\cdot}$       $\bar x_{4\cdot}$       $\bar x_{5\cdot}$
    797.50     839.00     843.50     914.00     937.25
               ________________     ________________


The underscoring indicates that the brands can be divided into three groups with
respect to power consumption.

Because the block factor is of secondary interest, the corresponding P-value
is not needed, though the computed value of $F_B$ is clearly highly significant.
Figure 11.4 shows SAS output for this data. At the top of the ANOVA table, the
sums of squares (SS’s) for treatments (brands) and blocks (humidity levels) are
combined into a single “model” SS.

Analysis of Variance Procedure

Dependent Variable: POWERUSE

                               Sum of          Mean
Source              DF        Squares        Square    F Value    Pr > F
Model                7     169448.750     24206.964     173.84    0.0001
Error               12       1671.000       139.250
Corrected Total     19     171119.750

R-Square        C.V.      Root MSE    POWERUSE Mean
0.990235    1.362242       11.8004        866.25000

Source              DF       Anova SS    Mean Square    F Value    Pr > F
BRAND                4      53231.000      13307.750      95.57    0.0001
HUMIDITY             3     116217.750      38739.250     278.20    0.0001

Alpha=0.05 df=12 MSE=139.25
Critical Value of Studentized Range = 4.508
Minimum Significant Difference = 26.597

Means with the same letter are not significantly different.

Tukey Grouping         Mean    N    BRAND
     A              937.250    4    5
     A
     A              914.000    4    4
     B              843.500    4    2
     B
     B              839.000    4    3
     C              797.500    4    1

Figure 11.4 SAS output for power consumption data
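The summary quantities in the top portion of the SAS output follow from the sums of squares in Table 11.3. The sketch below recomputes them; it assumes the usual definitions (R-Square as model SS over corrected total SS, Root MSE as the square root of MSE, and C.V. as 100 times Root MSE divided by the overall mean), which reproduce the printed values here. Variable names are illustrative.

```python
from math import sqrt

# Sums of squares and df from Table 11.3 (dehumidifier data).
ss_brand, ss_humidity, ss_error = 53231.00, 116217.75, 1671.00
df_brand, df_humidity, df_error = 4, 3, 12
grand_mean = 866.25  # overall average power consumption (kWh)

# "Model" line: treatments and blocks pooled together.
ss_model = ss_brand + ss_humidity
df_model = df_brand + df_humidity
ss_total = ss_model + ss_error

mse = ss_error / df_error
f_model = (ss_model / df_model) / mse

r_square = ss_model / ss_total        # proportion of variation explained
root_mse = sqrt(mse)                  # estimate of sigma
cv = 100 * root_mse / grand_mean      # coefficient of variation, in percent

print(f"Model SS = {ss_model:.3f} on {df_model} df, F = {f_model:.2f}")
print(f"R-Square = {r_square:.6f}")
print(f"Root MSE = {root_mse:.4f}")
print(f"C.V.     = {cv:.6f}")
```

The pooled model F ratio is rarely of direct interest in a randomized block experiment; the separate BRAND line is the one that addresses the treatment hypothesis.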


In many experimental situations in which treatments are to be applied to
subjects, a single subject can receive all I of the treatments. Blocking is then often done
on the subjects themselves to control for variability between subjects; each subject
is then said to act as its own control. Social scientists sometimes refer to such
experiments as repeated-measures designs. The “units” within a block are then the
different “instances” of treatment application. Similarly, blocks are often taken as different
time periods, locations, or observers.


EXAMPLE 11.6 How does string tension in tennis rackets affect the speed of the ball coming off
the racket? The article “Elite Tennis Player Sensitivity to Changes in String
Tension and the Effect on Resulting Ball Dynamics” (Sports Engr., 2008:
31-36) described an experiment in which four different string tensions (N) were
used, and balls projected from a machine were hit by 18 different players. The
rebound speed (km/h) was then determined for each tension-player combination.
Consider the following data in Table 11.4 from a similar experiment involving
just six players (the resulting ANOVA is in good agreement with what was
reported in the article).


Table 11.4 Rebound Speed Data for Example 11.6

                                  Player
Tension          1        2        3        4        5        6       $\bar x_{i\cdot}$
210           105.7    116.6    106.6    113.9    119.4    123.5    114.28
235           113.3    119.9    120.5    119.3    122.5    124.0    119.92
260           117.2    124.4    122.3    120.0    115.1    127.9    121.15
285           110.0    106.8    110.0    115.3    122.6    128.3    115.50
$\bar x_{\cdot j}$       111.55   116.93   114.85   117.13   119.90   125.93


The ANOVA calculations are summarized in Table 11.5. The P-value for testing
to see whether true average rebound speed depends on string tension is .049. Thus
$H_0$: $\alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$ is barely rejected at significance level .05 in favor of
the conclusion that true average speed does vary with tension ($F_{.05,3,15} = 3.29$).
Application of Tukey’s procedure to identify significant differences among tensions
requires $Q_{.05,4,15} = 4.08$. Then w = 7.464. The difference between the largest and
smallest sample mean tensions is 6.87. So although the F test is significant, Tukey’s
method does not identify any significant differences. This occasionally happens when
the null hypothesis is just barely rejected. The configuration of sample means in the
cited article is similar to ours. The authors commented that the results were
contrary to previous laboratory-based tests, where higher rebound speeds are typically
associated with low string tension.


Table 11.5 ANOVA Table for Example 11.6

Source      df       SS          MS         f        P
Tension      3     199.975     66.6582     3.32    0.049
Player       5     477.464     95.4928     4.76    0.008
Error       15     301.188     20.0792
Total       23     978.626


In most randomized block experiments in which subjects serve as blocks, the
subjects actually participating in the experiment are selected from a large population.
The subjects then contribute random rather than fixed effects. This does not affect
the procedure for comparing treatments when $K_{ij} = 1$ (one observation per “cell,” as
in this section), but the procedure is altered if $K_{ij} = K > 1$. We will shortly consider
two-factor models in which effects are random.


More on Blocking  When I = 2, either the F test or the paired differences t
test can be used to analyze the data. The resulting conclusion will not depend on
which procedure is used, since $T^2 = F$ and $t^2_{\alpha/2,\nu} = F_{\alpha,1,\nu}$.
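The identity between the squared paired t statistic and the F ratio for treatments can be verified numerically. The sketch below uses a small made-up data set (two treatments, five blocks); the numbers and function names are hypothetical and serve only to illustrate the equality.

```python
from math import sqrt


def paired_t(first, second):
    """Paired t statistic for H0: true mean difference = 0."""
    d = [a - b for a, b in zip(first, second)]
    n = len(d)
    d_bar = sum(d) / n
    s_d = sqrt(sum((v - d_bar) ** 2 for v in d) / (n - 1))
    return d_bar / (s_d / sqrt(n))


def block_f(first, second):
    """F ratio MSA/MSE from the two-factor ANOVA with I = 2 treatments."""
    x = [list(first), list(second)]
    I, J = 2, len(first)
    row_means = [sum(row) / J for row in x]
    col_means = [(x[0][j] + x[1][j]) / I for j in range(J)]
    grand = sum(row_means) / I
    ssa = J * sum((m - grand) ** 2 for m in row_means)
    sse = sum(
        (x[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(I)
        for j in range(J)
    )
    return (ssa / (I - 1)) / (sse / ((I - 1) * (J - 1)))


# Hypothetical responses: one value per treatment in each of 5 blocks.
treatment_1 = [12.1, 9.8, 11.4, 10.6, 13.0]
treatment_2 = [11.2, 9.9, 10.1, 9.7, 12.2]

t = paired_t(treatment_1, treatment_2)
f = block_f(treatment_1, treatment_2)
print(f"t = {t:.4f}, t^2 = {t * t:.4f}, F = {f:.4f}")
```

The agreement is exact up to floating-point rounding, because with I = 2 both SSA and SSE can be written in terms of the within-block differences.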

Just as with pairing, blocking entails both a potential gain and a potential loss
in precision. If there is a great deal of heterogeneity in experimental units, the value
of the variance parameter $\sigma^2$ in the one-way model will be large. The effect of
blocking is to filter out the variation represented by $\sigma^2$ in the two-way model appropriate
for a randomized block experiment. Other things being equal, a smaller value of
$\sigma^2$ results in a test that is more likely to detect departures from $H_0$ (i.e., a test with
greater power).

However, other things are not equal here, since the single-factor F test is
based on I(J − 1) degrees of freedom (df) for error, whereas the two-factor F test is
based on (I − 1)(J − 1) df for error. Fewer error df results in a decrease in power,
essentially because the denominator estimator of $\sigma^2$ is not as precise. This loss in
df can be especially serious if the experimenter can afford only a small number of
observations. Nevertheless, if it appears that blocking will significantly reduce
variability, the sacrifice of error df is sensible.

Models with Random and Mixed Effects


In many experiments, the actual levels of a factor used in the experiment, rather than
being the only ones of interest to the experimenter, have been selected from a much
larger population of possible levels of the factor. If this is true for both factors in a
two-factor experiment, a random effects model is appropriate. The case in which
the levels of one factor are the only ones of interest and the levels of the other factor
are selected from a population of levels leads to a mixed effects model. The
two-factor random effects model when $K_{ij} = 1$ is


$$X_{ij} = \mu + A_i + B_j + \epsilon_{ij} \qquad (i = 1, \ldots, I;\ j = 1, \ldots, J)$$


The $A_i$’s, $B_j$’s, and $\epsilon_{ij}$’s are all independent, normally distributed rv’s with mean
0 and variances $\sigma_A^2$, $\sigma_B^2$, and $\sigma^2$, respectively. The hypotheses of interest are then
$H_{0A}$: $\sigma_A^2 = 0$ (level of factor A does not contribute to variation in the response) versus
$H_{aA}$: $\sigma_A^2 > 0$ and $H_{0B}$: $\sigma_B^2 = 0$ versus $H_{aB}$: $\sigma_B^2 > 0$. Whereas $E(\text{MSE}) = \sigma^2$ as before,
the expected mean squares for factors A and B are now


$$E(\text{MSA}) = \sigma^2 + J\sigma_A^2 \qquad E(\text{MSB}) = \sigma^2 + I\sigma_B^2$$


Thus when $H_{0A}$ ($H_{0B}$) is true, $F_A$ ($F_B$) is still a ratio of two unbiased estimators of $\sigma^2$.
It can be shown that the P-value for testing $H_{0A}$ versus $H_{aA}$ is computed as for the
case of fixed effects; an analogous comment applies for testing $H_{0B}$ versus $H_{aB}$.

If factor A is fixed and factor B is random, the mixed model is


$$X_{ij} = \mu + \alpha_i + B_j + \epsilon_{ij} \qquad (i = 1, \ldots, I;\ j = 1, \ldots, J)$$


where $\sum \alpha_i = 0$ and the $B_j$’s and $\epsilon_{ij}$’s are normally distributed with mean 0 and
variances $\sigma_B^2$ and $\sigma^2$, respectively. Now the two null hypotheses are


$$H_{0A}: \alpha_1 = \cdots = \alpha_I = 0 \qquad \text{and} \qquad H_{0B}: \sigma_B^2 = 0$$

with expected mean squares


$$E(\text{MSE}) = \sigma^2 \qquad E(\text{MSA}) = \sigma^2 + \frac{J}{I - 1}\sum \alpha_i^2 \qquad E(\text{MSB}) = \sigma^2 + I\sigma_B^2$$


The test procedures for $H_{0A}$ versus $H_{aA}$ and $H_{0B}$ versus $H_{aB}$ are exactly as before.
For example, in the analysis of the color-change data in Example 11.1, if the
four wash treatments were randomly selected, then because $f_B = 11.05$ and
$F_{.05,3,6} = 4.76$, $H_{0B}$: $\sigma_B^2 = 0$ is rejected in favor of $H_{aB}$: $\sigma_B^2 > 0$. An estimate of the
“variance component” $\sigma_B^2$ is then given by (MSB − MSE)/I = .0485.
Summarizing, when $K_{ij} = 1$, although the hypotheses and expected mean
squares differ from the case of both effects fixed, the test procedures are identical.
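The variance component estimate quoted for the color-change data comes from solving $E(\text{MSB}) = \sigma^2 + I\sigma_B^2$ for $\sigma_B^2$ and substituting the observed mean squares. A minimal sketch of that arithmetic, with an illustrative function name and the rounded mean squares from Table 11.1, is:

```python
def variance_component_b(msb, mse, n_levels_a):
    """Estimate sigma_B^2 as (MSB - MSE) / I, where I = number of levels of A."""
    return (msb - mse) / n_levels_a


# Mean squares from Table 11.1; factor A (brand of pen) has I = 3 levels.
sigma2_b_hat = variance_component_b(msb=0.1599, mse=0.01447, n_levels_a=3)
print(f"estimated variance component for factor B: {sigma2_b_hat:.4f}")
```

The divisor is the number of levels of the other factor because each factor B sample mean averages over the I levels of factor A.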


EXERCISES Section 11.1 (1-15) 


1. An experiment was carried out to investigate the effect of species (factor A, with
I = 4) and grade (factor B, with J = 3) on breaking strength of wood specimens. One
observation was made for each species-grade combination, resulting in SSA = 442.0,
SSB = 428.6, and SSE = 123.4. Assume that an additive model is appropriate.

a. Test $H_0$: $\alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$ (no differences in
true average strength due to species) versus $H_a$: at
least one $\alpha_i \neq 0$ using a level .05 test.

b. Test $H_0$: $\beta_1 = \beta_2 = \beta_3 = 0$ (no differences in true
average strength due to grade) versus $H_a$: at least one
$\beta_j \neq 0$ using a level .05 test.


2. Four different coatings are being considered for corrosion
protection of metal pipe. The pipe will be buried in
three different types of soil. To investigate whether the
amount of corrosion depends either on the coating or on
the type of soil, 12 pieces of pipe are selected. Each piece
is coated with one of the four coatings and buried in one
of the three types of soil for a fixed time, after which the
amount of corrosion (depth of maximum pits, in .0001 in.)
is determined. The data appears in the table.


                       Soil Type (B)
                       1      2      3

                  1   64     49     50
Coating (A)       2   23     51     48
                  3   47     45     50
                  4          43     52


a. Assuming the validity of the additive model, carry out
the ANOVA analysis using an ANOVA table to see
whether the amount of corrosion depends on either the
type of coating used or the type of soil. Use $\alpha = .05$.

b. Compute $\hat\mu$, $\hat\alpha_1$, $\hat\alpha_2$, $\hat\alpha_3$, $\hat\alpha_4$, $\hat\beta_1$, $\hat\beta_2$, and $\hat\beta_3$.


3. An investigation of the machinability of beryllium-copper
alloy using two different dielectric mediums and four
different working currents resulted in the following data
on material removal rate (this is a subset of the data that
appeared in the article “Statistical Analysis and
Optimization Study on the Machinability of Beryllium-Copper
Alloy in Electro Discharge Machining,” J. of
Engr. Manufacture, 2012: 1847-1861).


                        Working Current
                    10       15       20       25
Medium    Oil     .2433    .3830    .5625    .7258
          Water   .1590    .2649    .3609    .4773


a. After constructing an ANOVA table, test at level .05
both the hypothesis of no medium effect against the
appropriate alternative and the hypothesis of no
working current effect against the appropriate alternative.

b. Use Tukey’s procedure to investigate differences in
expected material removal rate due to different
working currents ($Q_{.05,4,3} = 6.825$).


4. In an experiment to see whether the amount of coverage
of light-blue interior latex paint depends either on the
brand of paint or on the brand of roller used, one gallon
of each of four brands of paint was applied using each of
three brands of roller, resulting in the following data
(number of square feet covered).


                      Roller Brand
                      1      2      3
                 1   454    446    451
Paint            2   446    444    447
Brand            3   439    442    444
                 4   444    437    443


a. Construct the ANOVA table. [Hint: The computations
can be expedited by subtracting 400 (or any
other convenient number) from each observation.
This does not affect the final results.]

b. State and test hypotheses appropriate for deciding
whether paint brand has any effect on coverage. Use
$\alpha = .05$.

c. Repeat part (b) for brand of roller.

d. Use Tukey’s method to identify significant
differences among brands. Is there one brand that seems
clearly preferable to the others?


5. In an experiment to assess the effect of the angle of pull
on the force required to cause separation in electrical
connectors, four different angles (factor A) were used,
and each of a sample of five connectors (factor B) was
pulled once at each angle (“A Mixed Model Factorial
Experiment in Testing Electrical Connectors,”
Industrial Quality Control, 1960: 12-16). The data
appears in the accompanying table.


                        Connector (B)
                 1       2       3       4       5
Angle     0°   45.3    42.2    39.6    36.8    45.8
(A)       2°   44.1    44.1    38.4    38.0    47.2
          4°   42.7    42.7    42.6    42.2    48.9
          6°   43.5    45.8    47.9    37.9    56.4


Does the data suggest that true average separation
force is affected by the angle of pull? State and test the
appropriate hypotheses at level .01 by first constructing
an ANOVA table (SST = 396.13, SSA = 58.16, and
SSB = 246.97).


6. A particular county employs three assessors who are
responsible for determining the value of residential
property in the county. To see whether these assessors differ
systematically in their assessments, 5 houses are selected,
and each assessor is asked to determine the market value
of each house. With factor A denoting assessors (I = 3)
and factor B denoting houses (J = 5), suppose
SSA = 11.7, SSB = 113.5, and SSE = 25.6.
a. Test $H_0$: $\alpha_1 = \alpha_2 = \alpha_3 = 0$ at level .05. ($H_0$ states that
there are no systematic differences among assessors.)
b. Explain why a randomized block experiment with
only 5 houses was used rather than a one-way
ANOVA experiment involving a total of 15 different
houses, with each assessor asked to assess 5 different
houses (a different group of 5 for each assessor).


7. The accompanying data resulted from an experiment
involving three different brands of lathe in combination
with three different operators (the blocking factor). The
response variable is the percentage of acceptable product
produced during a full workday shift (from “A Software-Based
Resource Selection Process in Competitive
Network Environment Using ANOVA,” Intl. J. of
Computer Applications, 2012: 17-21).


                    Operator
                  1     2     3
             1   86    85    82
Brand        2   86    86    83
             3   88    91    85


a. Construct the ANOVA table and test at level .05 to
see whether brand of lathe has an effect on product
acceptability.

b. Judging from the F ratio for operators (factor B), do
you think that blocking on operators was effective in
this experiment? Explain.


8. The paper “Exercise Thermoregulation and
Hyperprolactinaemia” (Ergonomics, 2005: 1547-1557)
discussed how various aspects of exercise capacity might
depend on the temperature of the environment. The
accompanying data on body mass loss (kg) after exercising on a
semi-recumbent cycle ergometer in three different ambient
temperatures (6°C, 18°C, and 30°C) was provided by the
paper’s authors.


Cold Neutral Hot 


1 4 1.2 1.6 
2 4 1.5 1.9 
3 1.4 8 1.0 
4 id 4 A 
Subject 5 1.1 1.8 2.4 
6 1.2 1.0 1.6 
7 7 1.0 1.4 
8 7 1.5 1.3 
9 8 8 1.1 


a. Does temperature affect true average body mass 
loss? Carry out a test using a significance level of .01 
(as did the authors of the cited paper). 

b. Investigate significant differences among the 
temperatures. 

c. The residuals are .20, .30, —.40, —.07, .30, .00, .03, 
—.20, —.14, .13, .23, —.27, —.04, .03, —.27, —.04, 
333,10) =233)'= 53,467; = 33,2701, —113, 
.24. Use these as a basis for investigating the plausibil- 
ity of the assumptions that underlie your analysis in (a). 


9. The article “The Effects of a Pneumatic Stool and a
One-Legged Stool on Lower Limb Joint Load and
Muscular Activity During Sitting and Rising”
(Ergonomics, 1993: 519-535) gives the accompanying
data on the effort required of a subject to arise from four
different types of stools (Borg scale). Perform an
analysis of variance using $\alpha = .05$, and follow this with a
multiple comparisons analysis if appropriate.


                               Subject
                1    2    3    4    5    6    7    8    9     $\bar x_{i\cdot}$
           1   12   10    7    7    8    9    8    7    9     8.56
Type       2   15   14   14   11   11   11   12   11   13    12.44
of         3   12   13   13   10    8   11   12    8   10    10.78
Stool      4   10   12    9    9    7   10   11    7    8     9.22


10. The strength of concrete used in commercial construc- 
tion tends to vary from one batch to another. Consequently, 
small test cylinders of concrete sampled from a batch are 
“cured” for periods up to about 28 days in temperature- 
and moisture-controlled environments before strength 
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11. 


12. 


13. 


14. 


15. 


measurements are made. Concrete is then “bought and 
sold on the basis of strength test cylinders” (ASTM C 31 
Standard Test Method for Making and Curing Concrete 
Test Specimens in the Field). The accompanying data 
resulted from an experiment carried out to compare three 
different curing methods with respect to compressive 
strength (MPa). Analyze this data. 


Batch Method A Method B Method C 
1 30.7 33.7 30.5 
2 29.1 30.6 32.6 
3 30.0 32.2 30.5 
4 31.9 34.6 33.5 
5 30.5 33.0 32.4 
6 26.9 29.3 27.8 
7 28.2 28.4 30.7 
8 32.4 32.4 33.6 
9 26.6 29.5 29.2 

10 28.6 29.4 33.2 


For the data of Example 11.5, check the plausibility of 
assumptions by constructing a normal probability plot 
of the residuals and a plot of the residuals versus the 
predicted values, and comment on what you learn. 


Suppose that in the experiment described in Exercise 6 
the five houses had actually been selected at random 
from among those of a certain age and size, so that factor 
B is random rather than fixed. Test Hy: 0% = 0 versus 
H,: 03, > 0 using a level .01 test. 


a. Show that a constant d can be added to (or subtracted 
from) each x, without affecting any of the ANOVA 
sums of squares. 

b. Suppose that each x,; is multiplied by a nonzero con- 
stant c. How does this affect the ANOVA sums of 
squares? How does this affect the values of the F 
statistics F', and F,? What effect does “coding” the 
data by y,, = cx, + d have on the conclusions result- 
ing from the ANOVA procedures? 


Use the fact that E(X;)=t+a,+ 6, with 
Ya; = =P; = 0 to show that E(X;,— X..) =a, so that 
a, = X,. — X.. is an unbiased estimator for a,. 


The power curves of Figures 10.5 and 10.6 can be used to 
obtain $\beta = P(\text{type II error})$ for the F test in two-factor 
ANOVA. For fixed values of $\alpha_1, \alpha_2, \ldots, \alpha_I$, the quantity 
$\phi^2 = (J/I)\sum \alpha_i^2/\sigma^2$ is computed. Then the figure corresponding 
to $\nu_1 = I - 1$ is entered on the horizontal axis at 
the value $\phi$, the power is read on the vertical axis from the 
curve labeled $\nu_2 = (I - 1)(J - 1)$, and $\beta = 1 - \text{power}$. 

a. For the corrosion experiment described in Exercise 2, 
find $\beta$ when $\alpha_1 = 4$, $\alpha_2 = 0$, $\alpha_3 = \alpha_4 = -2$, and 
$\sigma = 4$. Repeat for $\alpha_1 = 6$, $\alpha_2 = 0$, $\alpha_3 = \alpha_4 = -3$, 
and $\sigma = 4$. 

b. By symmetry, what is $\beta$ for the test of $H_{0B}$ versus $H_{aB}$ 
in Example 11.1 when $\beta_1 = .3$, $\beta_2 = \beta_3 = \beta_4 = -.1$, 
and $\sigma = .3$? 
11.2 Two-Factor ANOVA with $K_{ij} > 1$ 


In Section 11.1, we analyzed data from a two-factor experiment in which there 
was one observation for each of the $IJ$ combinations of factor levels. The $\mu_{ij}$'s were 
assumed to have an additive structure with $\mu_{ij} = \mu + \alpha_i + \beta_j$, $\sum \alpha_i = \sum \beta_j = 0$. 
Additivity means that the difference in true average responses for any two levels 
of one factor is the same for each level of the other factor. For example, 
$\mu_{ij} - \mu_{i'j} = (\mu + \alpha_i + \beta_j) - (\mu + \alpha_{i'} + \beta_j) = \alpha_i - \alpha_{i'}$, independent of the level $j$ of 
the second factor. This is shown in Figure 11.1(a) on p. 440, in which the lines 
connecting true average responses are parallel. 

Figure 11.1(b) depicts a set of true average responses that does not have additive 
structure. The lines connecting these $\mu_{ij}$'s are not parallel, which means that the 
difference in true average responses for different levels of one factor does depend on 
the level of the other factor. When additivity does not hold, we say that there is 
interaction between the different levels of the factors. The assumption of additivity 
in Section 11.1 allowed us to obtain an estimator of the random error variance 
$\sigma^2$ (MSE) that was unbiased whether or not either null hypothesis of interest was 
true. When $K_{ij} > 1$ for at least one $(i, j)$ pair, a valid estimator of $\sigma^2$ can be obtained 
without assuming additivity. Our focus here will be on the case $K_{ij} = K > 1$, so the 
number of observations per “cell” (for each combination of levels) is constant. 


Fixed Effects Parameters and Hypotheses 


Rather than use the $\mu_{ij}$'s themselves as model parameters, it is customary to use an 
equivalent set that reveals more clearly the role of interaction. 


NOTATION 
$$\mu = \frac{1}{IJ}\sum_i \sum_j \mu_{ij} \qquad \bar{\mu}_{i\cdot} = \frac{1}{J}\sum_j \mu_{ij} \qquad \bar{\mu}_{\cdot j} = \frac{1}{I}\sum_i \mu_{ij} \tag{11.7}$$ 


Thus $\mu$ is the expected response averaged over all levels of both factors (the true 
grand mean), $\bar{\mu}_{i\cdot}$ is the expected response averaged over levels of the second factor 
when the first factor A is held at level $i$, and similarly for $\bar{\mu}_{\cdot j}$. 


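DEFINITION 
$$\alpha_i = \bar{\mu}_{i\cdot} - \mu = \text{the effect of factor } A \text{ at level } i$$ 
$$\beta_j = \bar{\mu}_{\cdot j} - \mu = \text{the effect of factor } B \text{ at level } j \tag{11.8}$$ 
$$\gamma_{ij} = \mu_{ij} - (\mu + \alpha_i + \beta_j) = \text{interaction between factor } A \text{ at level } i \text{ and factor } B \text{ at level } j$$ 
from which 
$$\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} \tag{11.9}$$ 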


The model is additive if and only if all $\gamma_{ij}$'s $= 0$. The $\gamma_{ij}$'s are referred to as the 
interaction parameters. The $\alpha_i$'s are called the main effects for factor A, and the $\beta_j$'s are 
the main effects for factor B. Although there are $I$ $\alpha_i$'s, $J$ $\beta_j$'s, and $IJ$ $\gamma_{ij}$'s in addition 
to $\mu$, the conditions $\sum \alpha_i = 0$, $\sum \beta_j = 0$, $\sum_j \gamma_{ij} = 0$ for any $i$, and $\sum_i \gamma_{ij} = 0$ for any $j$ [all 
by virtue of (11.7) and (11.8)] imply that only $IJ$ of these new parameters are independently 
determined: $\mu$, $I - 1$ of the $\alpha_i$'s, $J - 1$ of the $\beta_j$'s, and $(I - 1)(J - 1)$ of the $\gamma_{ij}$'s. 
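
The reparameterization above can be checked numerically. The following is a minimal sketch, assuming a small made-up table of true cell means (the numbers and the names `mu_ij`, `alpha`, `beta`, `gamma` are hypothetical, chosen only for illustration); it computes the grand mean, main effects, and interaction parameters and confirms the sum-to-zero conditions.

```python
# Hypothetical true cell means mu_ij[i][j]: I = 2 levels of A, J = 3 levels of B.
# These values are invented for illustration; they are not from the text.
mu_ij = [[10.0, 12.0, 17.0],
         [14.0, 13.0, 12.0]]
I, J = len(mu_ij), len(mu_ij[0])

mu = sum(sum(row) for row in mu_ij) / (I * J)                      # grand mean
mu_i = [sum(row) / J for row in mu_ij]                             # average over B
mu_j = [sum(mu_ij[i][j] for i in range(I)) / I for j in range(J)]  # average over A

alpha = [m - mu for m in mu_i]                                     # main effects of A
beta = [m - mu for m in mu_j]                                      # main effects of B
gamma = [[mu_ij[i][j] - (mu + alpha[i] + beta[j]) for j in range(J)]
         for i in range(I)]                                        # interaction parameters

print("mu =", mu)
print("alpha =", alpha)
print("beta =", beta)
print("gamma =", gamma)
```

In this made-up table both $\alpha_i$'s are 0 even though the $\gamma_{ij}$'s are not, which illustrates why a main-effect test is hard to interpret once interaction is present.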




There are now three sets of hypotheses to be considered: 


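$H_{0AB}\colon \gamma_{ij} = 0$ for all $i, j$ versus $H_{aAB}$: at least one $\gamma_{ij} \ne 0$ 
$H_{0A}\colon \alpha_1 = \cdots = \alpha_I = 0$ versus $H_{aA}$: at least one $\alpha_i \ne 0$ 
$H_{0B}\colon \beta_1 = \cdots = \beta_J = 0$ versus $H_{aB}$: at least one $\beta_j \ne 0$ 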


The no-interaction hypothesis $H_{0AB}$ is usually tested first. If $H_{0AB}$ is not rejected, then 
the other two hypotheses can be tested to see whether the main effects are significant. 
If $H_{0AB}$ is rejected and $H_{0A}$ is then tested and not rejected, the resulting model 
$\mu_{ij} = \mu + \beta_j + \gamma_{ij}$ does not lend itself to straightforward interpretation. In such a 
case, it is best to construct a picture similar to that of Figure 11.1(b) to try to visualize 
the way in which the factors interact. 


The Model and Test Procedures 


We now use triple subscripts for both random variables and observed values, with 
$X_{ijk}$ and $x_{ijk}$ referring to the $k$th observation (replication) when factor A is at level $i$ 
and factor B is at level $j$. 


The fixed effects model is 
$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk} \qquad i = 1, \ldots, I,\; j = 1, \ldots, J,\; k = 1, \ldots, K \tag{11.10}$$ 
where the $\epsilon_{ijk}$'s are independent and normally distributed, each with mean 0 
and variance $\sigma^2$. 


Again, a dot in place of a subscript denotes summation over all values of that 
subscript, and a horizontal bar indicates averaging. Thus $X_{ij\cdot}$ is the total of all $K$ 
observations made for factor A at level $i$ and factor B at level $j$ [all observations in 
the $(i, j)$th cell], and $\bar{X}_{ij\cdot}$ is the average of these $K$ observations. Test procedures are 
based on the following sums of squares: 


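DEFINITION 
$$SST = \sum_i \sum_j \sum_k (X_{ijk} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = IJK - 1$$ 
$$SSE = \sum_i \sum_j \sum_k (X_{ijk} - \bar{X}_{ij\cdot})^2 \qquad df = IJ(K - 1)$$ 
$$SSA = \sum_i \sum_j \sum_k (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = I - 1$$ 
$$SSB = \sum_i \sum_j \sum_k (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = J - 1$$ 
$$SSAB = \sum_i \sum_j \sum_k (\bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = (I - 1)(J - 1)$$ 
The fundamental identity is 
$$SST = SSA + SSB + SSAB + SSE$$ 
SSAB is referred to as interaction sum of squares. 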
Total variation is thus partitioned into four pieces: unexplained (SSE, which would 
be present whether or not any of the three null hypotheses was true) and three pieces 
that may be attributed to the truth or falsity of the three $H_0$'s. Each of four mean 
squares is defined by MS = SS/df. The expected mean squares suggest that each 
set of hypotheses should be tested using the appropriate ratio of mean squares with 
MSE in the denominator: 


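$$E(MSE) = \sigma^2$$ 
$$E(MSA) = \sigma^2 + \frac{JK}{I - 1}\sum_{i=1}^{I} \alpha_i^2 \qquad E(MSB) = \sigma^2 + \frac{IK}{J - 1}\sum_{j=1}^{J} \beta_j^2$$ 
$$E(MSAB) = \sigma^2 + \frac{K}{(I - 1)(J - 1)}\sum_{i=1}^{I}\sum_{j=1}^{J} \gamma_{ij}^2$$ 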


Each of the three mean square ratios can be shown to have an F distribution with 
appropriate dfs when the associated $H_0$ is true. If $H_{0AB}$ is false, the expected value of the 
numerator mean square in $F_{AB}$ exceeds that of the denominator mean square. The larger 
the value of this F ratio, the stronger is the evidence against the null hypothesis, again 
implying an upper-tailed test. Analogous comments apply to the tests for main effects. 


Hypotheses                       Test Statistic Value      P-Value Determination 
$H_{0A}$ versus $H_{aA}$         $f_A = MSA/MSE$           Area under the $F_{I-1,\,IJ(K-1)}$ curve to the right of $f_A$ 
$H_{0B}$ versus $H_{aB}$         $f_B = MSB/MSE$           Area under the $F_{J-1,\,IJ(K-1)}$ curve to the right of $f_B$ 
$H_{0AB}$ versus $H_{aAB}$       $f_{AB} = MSAB/MSE$       Area under the $F_{(I-1)(J-1),\,IJ(K-1)}$ curve to the right of $f_{AB}$ 


EXAMPLE 11.7 Lightweight aggregate asphalt mix has been found to have lower thermal conductiv- 
ity than a conventional mix, which is desirable. The article “Influence of Selected 
Mix Design Factors on the Thermal Behavior of Lightweight Aggregate Asphalt 
Mixes” (J. of Testing and Eval., 2008: 1-8) reported on an experiment in which 
various thermal properties of mixes were determined. Three different binder grades 
were used in combination with three different coarse aggregate contents (%), with 
two observations made for each such combination, resulting in the conductivity data 
(W/m-°K) that appears in Table 11.6. 


Table 11.6 Conductivity Data for Example 11.7 


                         Coarse Aggregate Content (%) 
Asphalt Binder Grade     38            41            44            $\bar{x}_{i\cdot\cdot}$ 
PG58                     .835, .845    .822, .826    .785, .795    .8180 
PG64                     .855, .865    .832, .836    .790, .800    .8297 
PG70                     .815, .825    .800, .820    .770, .790    .8033 
$\bar{x}_{\cdot j\cdot}$                  .8400         .8227         .7883 


Here $I = J = 3$ and $K = 2$ for a total of $IJK = 18$ observations. The results of the 
analysis are summarized in the ANOVA table which appears as Table 11.7 (a table 
with additional information appeared in the cited article). 
Table 11.7 ANOVA Table for Example 11.7 


Source         DF    SS          MS          f        P 
AsphGr          2    .0020893    .0010447    14.12    0.002 
AggCont         2    .0082973    .0041487    56.06    0.000 
Interaction     4    .0003253    .0000813     1.10    0.414 
Error           9    .0006660    .0000740 
Total          17    .0113780 
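
The entries of Table 11.7 can be reproduced directly from the sums-of-squares definitions. The sketch below is a plain-Python illustration under the balanced-design assumption ($K_{ij} = K$); the function name `two_factor_anova` and the nested-list data layout are choices made here, not part of the text. P-values are omitted because they require an F distribution routine from a statistics library.

```python
def two_factor_anova(cells):
    """Balanced two-factor ANOVA. cells[i][j] holds the K observations
    for level i of factor A and level j of factor B."""
    I, J, K = len(cells), len(cells[0]), len(cells[0][0])
    grand = sum(x for row in cells for cell in row for x in cell) / (I * J * K)
    cell_mean = [[sum(cell) / K for cell in row] for row in cells]
    a_mean = [sum(cell_mean[i]) / J for i in range(I)]
    b_mean = [sum(cell_mean[i][j] for i in range(I)) / I for j in range(J)]

    ss = {
        "A": J * K * sum((m - grand) ** 2 for m in a_mean),
        "B": I * K * sum((m - grand) ** 2 for m in b_mean),
        "AB": K * sum((cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
                      for i in range(I) for j in range(J)),
        "E": sum((x - cell_mean[i][j]) ** 2
                 for i in range(I) for j in range(J) for x in cells[i][j]),
    }
    df = {"A": I - 1, "B": J - 1, "AB": (I - 1) * (J - 1), "E": I * J * (K - 1)}
    ms = {s: ss[s] / df[s] for s in ss}
    f = {s: ms[s] / ms["E"] for s in ("A", "B", "AB")}  # MSE in every denominator
    return ss, df, ms, f


# Table 11.6: rows are binder grades PG58, PG64, PG70; columns are
# coarse aggregate contents 38, 41, 44; two observations per cell.
conductivity = [
    [[.835, .845], [.822, .826], [.785, .795]],
    [[.855, .865], [.832, .836], [.790, .800]],
    [[.815, .825], [.800, .820], [.770, .790]],
]
ss, df, ms, f = two_factor_anova(conductivity)
for s in ("A", "B", "AB", "E"):
    print(s, df[s], round(ss[s], 7), round(ms[s], 7), round(f[s], 2) if s in f else "")
```

Here A is asphalt grade and B is aggregate content, so the printed rows correspond to the AsphGr, AggCont, Interaction, and Error lines of the table.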


The P-value for testing for the presence of interaction effects is .414, which is clearly 
larger than any reasonable significance level, so the interaction null hypothesis cannot 
be rejected. Thus it appears that there is no interaction between the two factors. 
However, both main effects are significant at the 5% significance level (.002 ≤ .05 
and .000 ≤ .05). So it appears that true average conductivity depends on which grade 
is used and also on the level of coarse-aggregate content. 

Figure 11.5(a) shows an interaction plot for the conductivity data. Notice the 
nearly parallel sets of line segments for the three different asphalt grades, in agreement 
with the F test that shows no significant interaction effects. True average conductivity 
appears to decrease as aggregate content increases. Figure 11.5(b) shows 
an interaction plot for the response variable thermal diffusivity, values of which 
appear in the cited article. The bottom two sets of line segments are close to parallel, 
but differ markedly from those for PG64; in fact, the F ratio for interaction effects 
is highly significant here. 


Figure 11.5 Interaction plots for the asphalt data of Example 11.7 (horizontal axis: Agg Cont at 38, 41, and 44). 
(a) Response variable is conductivity. (b) Response variable is diffusivity 


Plausibility of the normality and constant variance assumptions can be assessed 
by constructing plots similar to those of Section 11.1. Define the predicted (i.e., fitted) 
values to be the cell means: $\hat{x}_{ijk} = \bar{x}_{ij\cdot}$. For example, the predicted value for grade 
PG58 and aggregate content 38 is $\hat{x}_{11k} = (.835 + .845)/2 = .840$ for $k = 1, 2$. The 
residuals are the differences between the observations and corresponding predicted 
values: $x_{ijk} - \bar{x}_{ij\cdot}$. A normal probability plot of the residuals is shown in Figure 11.6(a). 
The pattern is sufficiently linear that there should be no concern about lack of 
normality. The plot of residuals against predicted values in Figure 11.6(b) shows a bit 
less spread on the right than on the left, but not enough of a differential to be worrisome; 
constant variance seems to be a reasonable assumption. 
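
As a small illustration of the fitted values and residuals just defined, the sketch below computes them for the conductivity data of Table 11.6. The dictionary layout keyed by (grade, aggregate content) is a choice made here for readability; the resulting pairs are what would be plotted in a residual-versus-fitted display.

```python
# Conductivity data from Table 11.6, keyed by (binder grade, aggregate content).
cells = {
    ("PG58", 38): [.835, .845], ("PG58", 41): [.822, .826], ("PG58", 44): [.785, .795],
    ("PG64", 38): [.855, .865], ("PG64", 41): [.832, .836], ("PG64", 44): [.790, .800],
    ("PG70", 38): [.815, .825], ("PG70", 41): [.800, .820], ("PG70", 44): [.770, .790],
}

# Fitted value for every observation in a cell is the cell mean.
fitted = {key: sum(obs) / len(obs) for key, obs in cells.items()}

# (fitted value, residual) pairs, one per observation.
pairs = [(fitted[key], x - fitted[key]) for key, obs in cells.items() for x in obs]
residuals = [r for _, r in pairs]

for fit, res in sorted(pairs):
    print(f"{fit:.3f}  {res:+.3f}")
```

Because each cell holds only two observations, the residuals come in plus/minus pairs, which explains the symmetric appearance of a residual plot for this data.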


Figure 11.6 Plots for checking normality and constant variance assumptions in Example 11.7: 
(a) normal probability plot (Percent versus Residual); (b) Residual versus Fitted Value 


Multiple Comparisons 


When the no-interaction hypothesis $H_{0AB}$ is not rejected and at least one of the two main 
effect null hypotheses is rejected, Tukey’s method can be used to identify significant 
differences in levels. For identifying differences among the $\alpha_i$'s when $H_{0A}$ is rejected, 


1. Obtain $Q_{\alpha, I, IJ(K-1)}$, where the second subscript $I$ identifies the number of levels 
being compared and the third subscript refers to the number of degrees of 
freedom for error. 


2. Compute $w = Q\sqrt{MSE/(JK)}$, where $JK$ is the number of observations averaged 
to obtain each of the $\bar{x}_{i\cdot\cdot}$'s compared in Step 3. 


3. Order the $\bar{x}_{i\cdot\cdot}$'s from smallest to largest and, as before, underscore all pairs that 
differ by less than $w$. Pairs not underscored correspond to significantly different 
levels of factor A. 


To identify different levels of factor B when $H_{0B}$ is rejected, replace the second 
subscript in $Q$ by $J$, replace $JK$ by $IK$ in $w$, and replace $\bar{x}_{i\cdot\cdot}$ by $\bar{x}_{\cdot j\cdot}$. 


EXAMPLE 11.8 (Example 11.7 continued) 
$I = J = 3$ for both factor A (grade) and factor B (aggregate content). With $\alpha = .05$ 
and error df $= IJ(K - 1) = 9$, $Q_{.05,3,9} = 3.95$. The yardstick for identifying significant 
differences is then $w = 3.95\sqrt{.0000740/6} = .0139$. The grade sample means in 
increasing order are .8033, .8180, and .8297. Only the difference between the two 
largest means is smaller than $w$. This gives the underscoring pattern 


PG70    PG58    PG64 
        ____________ 


Grades PG58 and PG64 do not appear to differ significantly from one another in 
effect on true average conductivity, but both differ from the PG70 grade. 

The ordered means for factor B are .7883, .8227, and .8400. All three pairs of 
means differ by more than .0139, so there are no underscoring lines. True average 
conductivity appears to be different for all three levels of aggregate content. 
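
The arithmetic of the Tukey yardstick for the grade means can be sketched as follows. The critical value 3.95 is taken from the example rather than computed, since the studentized range distribution is not available in the Python standard library; the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

Q = 3.95            # Q_{.05,3,9}, as quoted in the example (table lookup, not computed)
MSE = 0.0000740     # error mean square from Table 11.7
J, K = 3, 2         # each grade mean averages JK = 6 observations

w = Q * sqrt(MSE / (J * K))
print(f"w = {w:.4f}")

# Grade sample means in increasing order.
means = {"PG70": .8033, "PG58": .8180, "PG64": .8297}

# Pairs whose means differ by less than w are the ones that get underscored.
underscored = [(a, b) for a, b in combinations(means, 2)
               if abs(means[a] - means[b]) < w]
print("not significantly different:", underscored)
```

With six observations per mean, $\sqrt{.0000740/6} \approx .00351$, so $w \approx .0139$ and only the PG58 and PG64 pair falls inside the yardstick.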
Models with Mixed and Random Effects 


In some problems, the levels of either factor may have been chosen from a large 
population of possible levels, so that the effects contributed by the factor are random 
rather than fixed. As in Section 11.1, if both factors contribute random effects, the 
model is referred to as a random effects model, whereas if one factor is fixed and the 
other is random, a mixed effects model results. We will now consider the analysis 
for a mixed effects model in which factor A (rows) is the fixed factor and factor B 
(columns) is the random factor. The case in which both factors are random is dealt 
with in Exercise 26. 


DEFINITION The mixed effects model when factor A is fixed and factor B is random is 
$$X_{ijk} = \mu + \alpha_i + B_j + G_{ij} + \epsilon_{ijk} \qquad i = 1, \ldots, I,\; j = 1, \ldots, J,\; k = 1, \ldots, K$$ 


Here $\mu$ and the $\alpha_i$'s are constants with $\sum \alpha_i = 0$, and the $B_j$'s, $G_{ij}$'s, and $\epsilon_{ijk}$'s are independent, 
normally distributed random variables with expected value 0 and variances 
$\sigma_B^2$, $\sigma_G^2$, and $\sigma^2$, respectively.* The relevant hypotheses here are somewhat different 
from those for the fixed effects model. 


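$H_{0A}\colon \alpha_1 = \cdots = \alpha_I = 0$ versus $H_{aA}$: at least one $\alpha_i \ne 0$ 
$H_{0B}\colon \sigma_B^2 = 0$ versus $H_{aB}\colon \sigma_B^2 > 0$ 
$H_{0G}\colon \sigma_G^2 = 0$ versus $H_{aG}\colon \sigma_G^2 > 0$ 


It is customary to test $H_{0A}$ and $H_{0B}$ only if the no-interaction hypothesis $H_{0G}$ cannot 
be rejected. 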

Sums of squares and mean squares needed for the test procedures are defined 
and computed exactly as in the fixed effects case. The expected mean squares for 
the mixed model are 


$$E(MSE) = \sigma^2$$ 
$$E(MSA) = \sigma^2 + K\sigma_G^2 + \frac{JK}{I - 1}\sum \alpha_i^2$$ 
$$E(MSB) = \sigma^2 + K\sigma_G^2 + IK\sigma_B^2$$ 
$$E(MSAB) = \sigma^2 + K\sigma_G^2$$ 


The ratio $f_{AB} = MSAB/MSE$ is again appropriate for testing the no-interaction 
hypothesis, with the P-value determined as in the fixed effects case. However, for 
testing $H_{0A}$ versus $H_{aA}$, the expected mean squares suggest that although the numerator 
of the F ratio should still be MSA, the denominator should be MSAB rather than 
MSE. MSAB is also the denominator of the F ratio for testing $H_{0B}$. 


* This is referred to as an “unrestricted” model. An alternative “restricted” model requires that $\sum_i G_{ij} = 0$ 
for each $j$ (so the $G_{ij}$'s are no longer independent). Expected mean squares and F ratios appropriate for 
testing certain hypotheses depend on the choice of model. Minitab’s default option gives output for the 
unrestricted model. 


For testing $H_{0A}$ versus $H_{aA}$ (A fixed, B random), the test statistic value is 
$f_A = MSA/MSAB$, and the P-value is the area under the $F_{I-1,\,(I-1)(J-1)}$ curve 
to the right of $f_A$. The test of $H_{0B}$ versus $H_{aB}$ utilizes $f_B = MSB/MSAB$; the 
P-value is the area under the $F_{J-1,\,(I-1)(J-1)}$ curve to the right of $f_B$. 


EXAMPLE 11.9 A process engineer has identified two potential causes of electric motor vibration, 
the material used for the motor casing (factor A) and the supply source of bearings 
used in the motor (factor B). The accompanying data on the amount of vibration 
(microns) resulted from an experiment in which motors with casings made of steel, 
aluminum, and plastic were constructed using bearings supplied by five randomly 
selected sources. 


                                  Supply Source 
                      1            2            3            4            5 
          Steel       13.1 13.2    16.3 15.8    13.7 14.3    15.7 15.8    13.5 12.5 
Material  Aluminum    15.0 14.8    15.7 16.4    13.9 14.3    13.7 14.2    13.4 13.8 
          Plastic     14.0 14.3    17.2 16.7    12.4 12.3    14.4 13.9    13.2 13.1 


Only the three casing materials used in the experiment are under consideration for 
use in production, so factor A is fixed. However, the five supply sources were ran- 
domly selected from a much larger population, so factor B is random. The relevant 
null hypotheses are 


$$H_{0A}\colon \alpha_1 = \alpha_2 = \alpha_3 = 0 \qquad H_{0B}\colon \sigma_B^2 = 0 \qquad H_{0G}\colon \sigma_G^2 = 0$$ 


Minitab output appears in Figure 11.7. The P-value column in the ANOVA table indi- 
cates that the latter two null hypotheses should be rejected at significance level .05. 
Different casing materials by themselves do not appear to affect vibration, but interac- 
tion between material and supplier is a significant source of variation in vibration. 


Factor      Type     Levels   Values 
casmater    fixed         3   1 2 3 
source      random        5   1 2 3 4 5 

Source             DF        SS        MS        F        P 
casmater            2    0.7047    0.3523     0.24    0.790 
source              4   36.6747    9.1687     6.32    0.013 
casmater*source     8   11.6053    1.4507    13.03    0.000 
Error              15    1.6700    0.1113 
Total              29   50.6547 

Source                Variance     Error   Expected Mean Square for Each Term 
                      component    term    (using unrestricted model) 
1 casmater                         3       (4) + 2(3) + Q[1] 
2 source              1.2863       3       (4) + 2(3) + 6(2) 
3 casmater*source     0.6697       4       (4) + 2(3) 
4 Error               0.1113               (4) 


Figure 11.7 Output from Minitab’s balanced ANOVA option for the data of Example 11.9 
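
The F ratios in Figure 11.7 can be reproduced by hand, which makes the change of denominator for the mixed model explicit. The sketch below is plain Python for the vibration data; the helper name `mean_squares` is hypothetical. The variance component estimates are obtained by equating each mean square to its expected value under the unrestricted model and solving, which is an assumption about how the "Variance component" column is computed (the results agree with the printed values).

```python
def mean_squares(cells):
    """Mean squares for a balanced two-factor layout; cells[i][j] is the list
    of K observations at level i of A (fixed) and level j of B (random)."""
    I, J, K = len(cells), len(cells[0]), len(cells[0][0])
    grand = sum(x for row in cells for c in row for x in c) / (I * J * K)
    cm = [[sum(c) / K for c in row] for row in cells]
    am = [sum(cm[i]) / J for i in range(I)]
    bm = [sum(cm[i][j] for i in range(I)) / I for j in range(J)]
    msa = J * K * sum((m - grand) ** 2 for m in am) / (I - 1)
    msb = I * K * sum((m - grand) ** 2 for m in bm) / (J - 1)
    msab = K * sum((cm[i][j] - am[i] - bm[j] + grand) ** 2
                   for i in range(I) for j in range(J)) / ((I - 1) * (J - 1))
    mse = sum((x - cm[i][j]) ** 2 for i in range(I) for j in range(J)
              for x in cells[i][j]) / (I * J * (K - 1))
    return msa, msb, msab, mse, (I, J, K)


# Rows: steel, aluminum, plastic; columns: supply sources 1-5.
vibration = [
    [[13.1, 13.2], [16.3, 15.8], [13.7, 14.3], [15.7, 15.8], [13.5, 12.5]],
    [[15.0, 14.8], [15.7, 16.4], [13.9, 14.3], [13.7, 14.2], [13.4, 13.8]],
    [[14.0, 14.3], [17.2, 16.7], [12.4, 12.3], [14.4, 13.9], [13.2, 13.1]],
]
msa, msb, msab, mse, (I, J, K) = mean_squares(vibration)

f_interaction = msab / mse   # interaction is still tested against MSE
f_A = msa / msab             # fixed factor A: denominator is MSAB
f_B = msb / msab             # random factor B: denominator is MSAB

# Method-of-moments estimates from the expected mean squares (assumption, see lead-in).
var_G = (msab - mse) / K
var_B = (msb - msab) / (I * K)

print(round(f_A, 2), round(f_B, 2), round(f_interaction, 2))
print(round(var_B, 4), round(var_G, 4), round(mse, 4))
```

Had MSE been used as the denominator for the casing-material test, the ratio would have been roughly 3.2 instead of 0.24, so the choice of denominator matters for the conclusion.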


When at least two of the $K_{ij}$'s are unequal, the ANOVA computations are much 
more complex than for the case $K_{ij} = K$. In addition, there is controversy as to which 
test procedures should be used. One of the chapter references can be consulted for 
more information. 
EXERCISES Section 11.2 (16-26) 


16. In an experiment to assess the effects of curing time 
(factor A) and type of mix (factor B) on the compressive 
strength of hardened cement cubes, three different curing 
times were used in combination with four different 
mixes, with three observations obtained for each of the 
12 curing time-mix combinations. The resulting sums of 
squares were computed to be SSA = 30,763.0, SSB = 
34,185.6, SSE = 97,436.8, and SST = 205,966.6. 

a. Construct an ANOVA table. 

b. Test at level .05 the null hypothesis $H_{0AB}$: all $\gamma_{ij}$'s $= 0$ 
(no interaction of factors) against $H_{aAB}$: at least one 
$\gamma_{ij} \ne 0$. 

c. Test at level .05 the null hypothesis $H_{0A}\colon \alpha_1 = \alpha_2 = 
\alpha_3 = 0$ (factor A main effects are absent) against $H_{aA}$: at 
least one $\alpha_i \ne 0$. 

d. Test $H_{0B}\colon \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ versus $H_{aB}$: at least 
one $\beta_j \ne 0$ using a level .05 test. 

e. The values of the $\bar{x}_{i\cdot\cdot}$'s were $\bar{x}_{1\cdot\cdot} = 4010.88$, $\bar{x}_{2\cdot\cdot} = 
4029.10$, and $\bar{x}_{3\cdot\cdot} = 3960.02$. Use Tukey’s procedure 
to investigate significant differences among the three 
curing times. 


17. The article “Towards Improving the Properties of 
Plaster Moulds and Castings” (J. Engr. Manuf., 1991: 
265-269) describes several ANOVAs carried out to study 
how the amount of carbon fiber and sand additions affect 
various characteristics of the molding process. Here we 
give data on casting hardness and on wet-mold strength. 


Sand Carbon Fiber Casting Wet-Mold 
Addition (%) Addition(%) Hardness Strength 

0 0 61.0 34.0 

0 0 63.0 16.0 
15 0 67.0 36.0 
15 0 69.0 19.0 
30 0 65.0 28.0 
30 0 74.0 17.0 

0 25 69.0 49.0 

0 25 69.0 48.0 
15 25 69.0 43.0 
15 25 74.0 29.0 
30 25 74.0 31.0 
30 25 72.0 24.0 

0 50 67.0 55.0 

0 50 69.0 60.0 
15 50 69.0 45.0 
15 50 74.0 43.0 
30 50 74.0 22.0 
30 50 74.0 48.0 


a. An ANOVA for wet-mold strength gives SSSand = 
705, SSFiber = 1278, SSE = 843, and SST = 3105. 
Test for the presence of any effects using $\alpha = .05$. 

b. Carry out an ANOVA on the casting hardness observations 
using $\alpha = .05$. 

c. Plot sample mean hardness against sand percentage 
for different levels of carbon fiber. Is the plot consistent 
with your analysis in part (b)? 


18. The accompanying data resulted from an experiment to 
investigate whether yield from a certain chemical process 
depended either on the formulation of a particular 
input or on mixer speed. 


                          Speed 
                   60        70        80 
                  189.7     185.1     189.0 
              1   188.6     179.4     193.0 
                  190.1     177.3     191.1 
Formulation 
                  165.1     161.7     163.3 
              2   165.9     159.8     166.6 
                  167.6     161.6     170.3 


A statistical computer package gave SS(Form) = 2253.44, 
SS(Speed) = 230.81, SS(Form*Speed) = 18.58, and 
SSE = 71.87. 

a. Does there appear to be interaction between the factors? 

b. Does yield appear to depend on either formulation or 
speed? 

c. Calculate estimates of the main effects. 

d. The fitted values are $\hat{x}_{ijk} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_{ij}$, and 
the residuals are $x_{ijk} - \hat{x}_{ijk}$. Verify that the residuals 
are .23, −.87, .63, 4.50, −1.20, −3.30, −2.03, 1.97, 
.07, −1.10, −.30, 1.40, .67, −1.23, .57, −3.43, −.13, 
and 3.57. 

e. Construct a normal probability plot from the residuals 
given in part (d). Do the $\epsilon_{ijk}$'s appear to be normally 
distributed? 


19. A two-way ANOVA was carried out to assess the impact 
of type of farm (government agricultural settlement, 
established, individual) and tractor maintenance method 
(preventive, predictive, running, corrective, overhauling, 
breakdown) on the response variable maintenance practice 
contribution. There were two observations for each 
combination of factor levels. The resulting sums of 
squares were SSA = 35.75 (A = type of farm), SSB = 
861.20, SSAB = 603.51, and SSE = 341.82 (“Appraisal 
of Farm Practice Maintenance and Costs in Nigeria,” 
J. of Quality in Maintenance Engr., 2005: 152-168). 
Assuming both factor effects to be fixed, construct an 
ANOVA table, test for the presence of interaction, and 
then test for the presence of main effects for each factor 
(all using level .01). 


20. The article “Fatigue Limits of Enamel Bonds with 
Moist and Dry Techniques” (Dental Materials, 2009: 
1527-1531) described an experiment to investigate the 
ability of adhesive systems to bond to mineralized tooth 
structures. The response variable is shear bond strength 
(MPa), and two different adhesives (Adper Single Bond 
Plus and OptiBond Solo Plus) were used in combination 
with two different surface conditions. The accompanying 
data was supplied by the authors of the article. The first 12 
observations came from the SBP-dry treatment, the next 
12 from the SBP-moist treatment, the next 12 from the 
OBP-dry treatment, and the last 12 from the OBP-moist 
treatment. 


56.7 57.4 53.4 54.0 49.9 49.9 
56.2 51.9 49.6 45.7 56.8 54.1 
49.2 47.4 53.7 50.6 62.7 48.8 
41.0 57.4 51.4 53.4 55.2 38.9 
38.8 46.0 38.0 47.0 46.2 39.8 
25.9 37.8 43.4 40.2 35.4 40.3 
40.6 35.5 58.7 50.4 43.1 61.7 
33.3 38.7 45.4 47.2 53.3 44.9 


a. Construct a comparative boxplot of the data on the 
four different treatments and comment. 

b. Carry out an appropriate analysis of variance and 
state your conclusions (use a significance level of .01 
for any tests). Include any graphs that provide insight. 

c. If a significance level of .05 is used for the two-way 
ANOVA, the interaction effect is significant (just as 
in general different glues work better with some 
materials than with others). So now it makes sense 
to carry out a one-way ANOVA on the four treatments 
SBP-D, SBP-M, OBP-D, and OBP-M. Do 
this, and identify significant differences among the 
treatments. 


21. In an experiment to investigate the effect of “cement factor” 
(number of sacks of cement per cubic yard) on flexural 
strength of the resulting concrete (“Studies of 
Flexural Strength of Concrete. Part 3: Effects of 
Variation in Testing Procedure,” Proceedings, ASTM, 
1957: 1127-1139), $I = 3$ different factor values were 
used, $J = 5$ different batches of cement were selected, 
and $K = 2$ beams were cast from each cement factor/ 
batch combination. Sums of squares include 
SSA = 22,941.80, SSB = 22,765.53, SSE = 15,253.50, 
and SST = 64,954.70. Construct the ANOVA table. Then, 
assuming a mixed model with cement factor (A) fixed 
and batches (B) random, test the three pairs of hypotheses 
of interest at level .05. 


22. A study was carried out to compare the writing lifetimes 
of four premium brands of pens. It was thought that the 
writing surface might affect lifetime, so three different 
surfaces were randomly selected. A writing machine was 
used to ensure that conditions were otherwise homogeneous 
(e.g., constant pressure and a fixed angle). The 
accompanying table shows the two lifetimes (min) 
obtained for each brand-surface combination. 
                        Writing Surface 
                  1            2            3          $x_{i\cdot\cdot}$ 
          1    709, 659     713, 726     660, 645      4112 
Brand     2    668, 685     722, 740     692, 720      4227 
of Pen    3    659, 685     666, 684     678, 750      4122 
          4    698, 650     704, 666     686, 733      4137 
$x_{\cdot j\cdot}$        5413         5621         5564        16,598 


Carry out an appropriate ANOVA, and state your 
conclusions. 


23. The accompanying data was obtained in an experiment 
to investigate whether compressive strength of concrete 
cylinders depends on the type of capping material used 
or variability in different batches (“The Effect of Type 
of Capping Material on the Compressive Strength of 
Concrete Cylinders,” Proceedings ASTM, 1958: 1166-1186). 
Each number is a cell total ($x_{ij\cdot}$) based on $K = 3$ 
observations. 


                                 Batch 
                       1       2       3       4       5 
                   1   1847    1942    1935    1891    1795 
Capping Material   2   1779    1850    1795    1785    1626 
                   3   1806    1892    1889    1891    1756 


In addition, $\sum\sum\sum x_{ijk}^2 = 16{,}815{,}853$ and $\sum\sum x_{ij\cdot}^2 = 
50{,}443{,}409$. Obtain the ANOVA table and then test at 
level .01 the hypotheses $H_{0G}$ versus $H_{aG}$, $H_{0A}$ versus $H_{aA}$, 
and $H_{0B}$ versus $H_{aB}$, assuming that capping material is a 
fixed effects factor and batch is a random effects factor. 


24. a. Show that $E(\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot}) = \alpha_i$, so that $\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot}$ is an 
unbiased estimator for $\alpha_i$ (in the fixed effects model). 

b. With $\hat{\gamma}_{ij} = \bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot}$, show that $\hat{\gamma}_{ij}$ is an 
unbiased estimator for $\gamma_{ij}$ (in the fixed effects 
model). 


25. Show how a $100(1 - \alpha)\%$ t CI for $\alpha_i - \alpha_{i'}$ can be 
obtained. Then compute a 95% interval for $\alpha_2 - \alpha_3$ 
using the data from Exercise 19. [Hint: With $\theta = 
\alpha_2 - \alpha_3$, the result of Exercise 24(a) indicates how to 
obtain $\hat{\theta}$. Then compute $V(\hat{\theta})$ and $\sigma_{\hat{\theta}}$, and obtain an 
estimate of $\sigma_{\hat{\theta}}$ by using $\sqrt{MSE}$ to estimate $\sigma$ (which 
identifies the appropriate number of df).] 


26. When both factors are random in a two-way ANOVA 
experiment with $K$ replications per combination of factor 
levels, the expected mean squares are $E(MSE) = \sigma^2$, 
$E(MSA) = \sigma^2 + K\sigma_G^2 + JK\sigma_A^2$, $E(MSB) = \sigma^2 + K\sigma_G^2 + 
IK\sigma_B^2$, and $E(MSAB) = \sigma^2 + K\sigma_G^2$. 

a. What F ratio is appropriate for testing $H_{0G}\colon \sigma_G^2 = 0$ 
versus $H_{aG}\colon \sigma_G^2 > 0$? 

b. Answer part (a) for testing $H_{0A}\colon \sigma_A^2 = 0$ versus 
$H_{aA}\colon \sigma_A^2 > 0$ and $H_{0B}\colon \sigma_B^2 = 0$ versus $H_{aB}\colon \sigma_B^2 > 0$. 
11.3 Three-Factor ANOVA 


To indicate the nature of models and analyses when ANOVA experiments involve 
more than two factors, we will focus here on the case of three fixed factors: A, B, 
and C. The numbers of levels of these factors will be denoted by $I$, $J$, and $K$, respectively, 
and $L_{ijk}$ = the number of observations made with factor A at level $i$, factor B 
at level $j$, and factor C at level $k$. The analysis is quite complicated when the $L_{ijk}$'s 
are not all equal, so we further specialize to $L_{ijk} = L$. Then $X_{ijkl}$ and $x_{ijkl}$ denote the 
observed value, before and after the experiment is performed, of the $l$th replication 
($l = 1, 2, \ldots, L$) when the three factors are fixed at levels $i$, $j$, and $k$. 

To understand the parameters that will appear in the three-factor ANOVA 
model, first recall that in two-factor ANOVA with replications, $E(X_{ijk}) = \mu_{ij} = \mu + 
\alpha_i + \beta_j + \gamma_{ij}$, where the restrictions $\sum_i \alpha_i = \sum_j \beta_j = 0$, $\sum_i \gamma_{ij} = 0$ for every $j$, and 
$\sum_j \gamma_{ij} = 0$ for every $i$ were necessary to obtain a unique set of parameters. If we use 
dot subscripts on the $\mu_{ij}$'s to denote averaging (rather than summation), then 

$$\bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot} = \frac{1}{J}\sum_j \mu_{ij} - \frac{1}{IJ}\sum_i \sum_j \mu_{ij} = \alpha_i$$ 

is the effect of factor A at level $i$ averaged over levels of factor B, whereas 

$$\mu_{ij} - \bar{\mu}_{\cdot j} = \mu_{ij} - \frac{1}{I}\sum_i \mu_{ij} = \alpha_i + \gamma_{ij}$$ 

is the effect of factor A at level $i$ specific to factor B at level $j$. When the effect of A 
at level $i$ depends on the level of B, there is interaction between the factors, and the 
$\gamma_{ij}$'s are not all zero. In particular, 

$$\mu_{ij} - \bar{\mu}_{\cdot j} - \bar{\mu}_{i\cdot} + \bar{\mu}_{\cdot\cdot} = \gamma_{ij} \tag{11.11}$$ 


The Fixed Effects Model and Test Procedures 


The fixed effects model for three-factor ANOVA with $L_{ijk} = L$ is 

$$X_{ijkl} = \mu_{ijk} + \epsilon_{ijkl} \qquad i = 1, \ldots, I,\; j = 1, \ldots, J,\; k = 1, \ldots, K,\; l = 1, \ldots, L \tag{11.12}$$ 

where the $\epsilon_{ijkl}$'s are normally distributed with mean 0 and variance $\sigma^2$, and 

$$\mu_{ijk} = \mu + \alpha_i + \beta_j + \delta_k + \gamma_{ij}^{AB} + \gamma_{ik}^{AC} + \gamma_{jk}^{BC} + \gamma_{ijk} \tag{11.13}$$ 

The restrictions necessary to obtain uniquely defined parameters are that the sum 
over any subscript of any parameter on the right-hand side of (11.13) equal 0. 

The parameters $\gamma_{ij}^{AB}$, $\gamma_{ik}^{AC}$, and $\gamma_{jk}^{BC}$ are called two-factor interactions, and $\gamma_{ijk}$ 
is called a three-factor interaction; the $\alpha_i$'s, $\beta_j$'s, and $\delta_k$'s are the main effects parameters. 
For any fixed level $k$ of the third factor, analogous to (11.11), 

$$\mu_{ijk} - \bar{\mu}_{i\cdot k} - \bar{\mu}_{\cdot jk} + \bar{\mu}_{\cdot\cdot k} = \gamma_{ij}^{AB} + \gamma_{ijk}$$ 

is the interaction of the $i$th level of A with the $j$th level of B specific to the $k$th level 
of C, whereas 

$$\bar{\mu}_{ij\cdot} - \bar{\mu}_{i\cdot\cdot} - \bar{\mu}_{\cdot j\cdot} + \bar{\mu}_{\cdot\cdot\cdot} = \gamma_{ij}^{AB}$$ 

is the interaction between A at level $i$ and B at level $j$ averaged over levels of C. If the 
interaction of A at level $i$ and B at level $j$ does not depend on $k$, then all $\gamma_{ijk}$'s equal 
0. Thus nonzero $\gamma_{ijk}$'s represent nonadditivity of the two-factor $\gamma_{ij}^{AB}$'s over the various 
levels of the third factor C. If the experiment included more than three factors, there 
would be corresponding higher-order interaction terms with analogous interpretations. 
Note that in the previous argument, if we had considered fixing the level of 
either A or B (rather than C, as was done) and examining the $\gamma_{ijk}$'s, their interpretation 
would be the same; if any of the interactions of two factors depend on the level of the 
third factor, then there are nonzero $\gamma_{ijk}$'s. 

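When $L > 1$, there is a sum of squares for each main effect, each two-factor 
interaction, and the three-factor interaction. To write these in a way that indicates 
how sums of squares are defined when there are more than three factors, note that 
any of the model parameters in (11.13) can be estimated unbiasedly by averaging 
$X_{ijkl}$ over appropriate subscripts and taking differences. Thus 

$$\hat{\mu} = \bar{X}_{\cdot\cdot\cdot\cdot} \qquad \hat{\alpha}_i = \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot} \qquad \hat{\gamma}_{ij}^{AB} = \bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot\cdot\cdot}$$ 

$$\hat{\gamma}_{ijk} = \bar{X}_{ijk\cdot} - \bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot k\cdot} - \bar{X}_{\cdot jk\cdot} + \bar{X}_{i\cdot\cdot\cdot} + \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot k\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot}$$ 

with other main effects and interaction estimators obtained by symmetry. 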


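DEFINITION Relevant sums of squares are 

$$SST = \sum_i \sum_j \sum_k \sum_l (X_{ijkl} - \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = IJKL - 1$$ 
$$SSA = \sum_i \sum_j \sum_k \sum_l \hat{\alpha}_i^2 = JKL\sum_i (\bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = I - 1$$ 
$$SSAB = \sum_i \sum_j \sum_k \sum_l (\hat{\gamma}_{ij}^{AB})^2 = KL\sum_i \sum_j (\bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = (I - 1)(J - 1)$$ 
$$SSABC = \sum_i \sum_j \sum_k \sum_l \hat{\gamma}_{ijk}^2 = L\sum_i \sum_j \sum_k \hat{\gamma}_{ijk}^2 \qquad df = (I - 1)(J - 1)(K - 1)$$ 
$$SSE = \sum_i \sum_j \sum_k \sum_l (X_{ijkl} - \bar{X}_{ijk\cdot})^2 \qquad df = IJK(L - 1)$$ 

with the remaining main effect and two-factor interaction sums of squares 
obtained by symmetry. SST is the sum of the other eight SSs. 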
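
The definitions above translate into code fairly directly. The following sketch, in plain Python, computes all eight component sums of squares for the purity data of Table 11.8 and checks the identity that SST equals their sum. The helper `avg` and the dictionary layout are illustrative choices, and the averaging is recomputed on each call for clarity rather than speed.

```python
from itertools import product

# Table 11.8: two replicate rows per level of A; within a row the nine columns
# run B1(C1, C2, C3), B2(C1, C2, C3), B3(C1, C2, C3).
rows = [
    [[81.07, 88.71, 95.42, 81.54, 89.12, 96.32, 86.07, 92.05, 97.02],
     [82.22, 87.61, 94.06, 82.82, 86.49, 95.45, 87.73, 91.72, 96.16]],
    [[87.31, 89.52, 94.68, 87.99, 90.05, 96.44, 89.61, 90.32, 98.30],
     [87.94, 88.75, 95.45, 88.98, 90.42, 96.47, 89.02, 90.61, 96.62]],
    [[90.66, 91.60, 93.65, 92.14, 92.55, 97.41, 92.88, 96.12, 97.66],
     [91.87, 92.34, 95.73, 92.22, 97.06, 97.08, 93.30, 97.41, 97.59]],
]
I = J = K = 3
L = 2
x = {(i, j, k, l): rows[i][l][3 * j + k]
     for i, j, k, l in product(range(I), range(J), range(K), range(L))}


def avg(i=None, j=None, k=None):
    """Average of x over every subscript left as None (and over replications)."""
    vals = [v for (a, b, c, _), v in x.items()
            if (i is None or a == i) and (j is None or b == j) and (k is None or c == k)]
    return sum(vals) / len(vals)


g = avg()
ss = {
    "A": J * K * L * sum((avg(i=i) - g) ** 2 for i in range(I)),
    "B": I * K * L * sum((avg(j=j) - g) ** 2 for j in range(J)),
    "C": I * J * L * sum((avg(k=k) - g) ** 2 for k in range(K)),
    "AB": K * L * sum((avg(i=i, j=j) - avg(i=i) - avg(j=j) + g) ** 2
                      for i in range(I) for j in range(J)),
    "AC": J * L * sum((avg(i=i, k=k) - avg(i=i) - avg(k=k) + g) ** 2
                      for i in range(I) for k in range(K)),
    "BC": I * L * sum((avg(j=j, k=k) - avg(j=j) - avg(k=k) + g) ** 2
                      for j in range(J) for k in range(K)),
    "ABC": L * sum((avg(i, j, k) - avg(i=i, j=j) - avg(i=i, k=k) - avg(j=j, k=k)
                    + avg(i=i) + avg(j=j) + avg(k=k) - g) ** 2
                   for i, j, k in product(range(I), range(J), range(K))),
    "E": sum((v - avg(i, j, k)) ** 2 for (i, j, k, _), v in x.items()),
}
sst = sum((v - g) ** 2 for v in x.values())
df = {"A": I - 1, "B": J - 1, "C": K - 1, "AB": (I - 1) * (J - 1),
      "AC": (I - 1) * (K - 1), "BC": (J - 1) * (K - 1),
      "ABC": (I - 1) * (J - 1) * (K - 1), "E": I * J * K * (L - 1)}

for name in ss:
    print(name, df[name], round(ss[name], 3), round(ss[name] / df[name], 3))
print("SST", I * J * K * L - 1, round(sst, 3))
```

Dividing each mean square by the error mean square then gives the F ratios used in the tests.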


Each sum of squares (excepting SST) when divided by its df gives a mean 
square. Expected mean squares are 

$$E(MSE) = \sigma^2$$ 
$$E(MSA) = \sigma^2 + \frac{JKL}{I - 1}\sum_i \alpha_i^2$$ 
$$E(MSAB) = \sigma^2 + \frac{KL}{(I - 1)(J - 1)}\sum_i \sum_j (\gamma_{ij}^{AB})^2$$ 
$$E(MSABC) = \sigma^2 + \frac{L}{(I - 1)(J - 1)(K - 1)}\sum_i \sum_j \sum_k \gamma_{ijk}^2$$ 

with similar expressions for the other expected mean squares. Main effect and interaction 
hypotheses are tested by forming F ratios with MSE in each denominator. 


Null Hypothesis                          Test Statistic Value        P-Value Determination 
$H_{0A}$: all $\alpha_i$'s $= 0$              $f_A = MSA/MSE$             Area under the $F_{I-1,\,IJK(L-1)}$ curve to the right of $f_A$ 
$H_{0AB}$: all $\gamma_{ij}^{AB}$'s $= 0$     $f_{AB} = MSAB/MSE$         Area under the $F_{(I-1)(J-1),\,IJK(L-1)}$ curve to the right of $f_{AB}$ 
$H_{0ABC}$: all $\gamma_{ijk}$'s $= 0$        $f_{ABC} = MSABC/MSE$       Area under the $F_{(I-1)(J-1)(K-1),\,IJK(L-1)}$ curve to the right of $f_{ABC}$ 


Usually the main effect hypotheses are tested only if all interactions are judged not 
significant. 

This analysis assumes that $L_{ijk} = L > 1$. If $L = 1$, then as in the two-factor 
case, the highest-order interactions must be assumed absent to obtain an MSE that 
estimates $\sigma^2$. Setting $L = 1$ and disregarding the fourth subscript summation over $l$, 
the foregoing formulas for sums of squares are still valid, and error sum of squares 
is $SSE = \sum_i \sum_j \sum_k \hat{\gamma}_{ijk}^2$ with $\bar{X}_{ijk\cdot} = X_{ijk}$ in the expression for $\hat{\gamma}_{ijk}$. 

EXAMPLE 11.10 There has been increased interest in recent years in renewable fuels such as biodiesel, 
a form of diesel fuel derived from vegetable oils and animal fats. Advantages over 
petroleum diesel include nontoxicity, biodegradability, and lower greenhouse gas 
emissions. The article “Application of the Full Factorial Design to Optimization 
of Base-Catalyzed Sunflower Oil Ethanolysis” (Fuel, 2013: 433-442) reported 
on an investigation of three factors on the purity (%) of the biodiesel fuel fatty acid 
ethyl ester (FAEE). The factors and levels are as follows: 


A: Reaction temperature           25°C, 50°C, 75°C
B: Ethanol-to-oil molar ratio     6:1, 9:1, 12:1
C: Catalyst loading               .75 wt.%, 1.00 wt.%, 1.25 wt.%


The data appears in Table 11.8, where I = J = K = 3 and L = 2.


Table 11.8 Purity (%) data for Example 11.10

               B1                      B2                      B3
        C1     C2     C3        C1     C2     C3        C1     C2     C3
A1    81.07  88.71  95.42     81.54  89.12  96.32     86.07  92.05  97.02
      82.22  87.61  94.06     82.82  86.49  95.45     87.73  91.72  96.16
A2    87.31  89.52  94.68     87.99  90.05  96.44     89.61  90.32  98.30
      87.94  88.75  95.45     88.98  90.42  96.47     89.02  90.61  96.62
A3    90.66  91.60  93.65     92.14  92.55  97.41     92.88  96.12  97.66
      91.87  92.34  95.73     92.22  97.06  97.08     93.30  97.41  97.59


The resulting ANOVA table is shown in Table 11.9. The P-value for testing $H_{0ABC}$ is
.165, which is larger than any sensible significance level. This null hypothesis
therefore cannot be rejected; it appears that the extent of interaction between any pair of
factors is the same for each level of the remaining factor.
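
The sums of squares defined earlier in this section can be checked numerically. The following is a minimal sketch in plain Python, assuming the purity observations are keyed in as they are laid out in Table 11.8 (two replicate rows per level of A, nine columns running over B and then C); the helper names `mean`, `ss_main`, `ss_two`, and `gamma_abc` are illustrative choices, not notation from the text.

```python
from itertools import product

# Table 11.8: two replicate rows for each level of A; within a row the nine
# columns run B1(C1 C2 C3), B2(C1 C2 C3), B3(C1 C2 C3).
rows = [
    [81.07, 88.71, 95.42, 81.54, 89.12, 96.32, 86.07, 92.05, 97.02],
    [82.22, 87.61, 94.06, 82.82, 86.49, 95.45, 87.73, 91.72, 96.16],
    [87.31, 89.52, 94.68, 87.99, 90.05, 96.44, 89.61, 90.32, 98.30],
    [87.94, 88.75, 95.45, 88.98, 90.42, 96.47, 89.02, 90.61, 96.62],
    [90.66, 91.60, 93.65, 92.14, 92.55, 97.41, 92.88, 96.12, 97.66],
    [91.87, 92.34, 95.73, 92.22, 97.06, 97.08, 93.30, 97.41, 97.59],
]
I = J = K = 3
L = 2
levels = (I, J, K)

# x[(i, j, k, l)]: A level i, B level j, C level k, replicate l (all 0-based).
x = {(i, j, k, l): rows[L * i + l][K * j + k]
     for i, j, k, l in product(range(I), range(J), range(K), range(L))}


def mean(fixed=()):
    """Average of the observations matching `fixed`, a tuple of (position, level) pairs."""
    vals = [v for idx, v in x.items() if all(idx[p] == lev for p, lev in fixed)]
    return sum(vals) / len(vals)


grand = mean()


def ss_main(p):
    """Main-effect sum of squares for the factor in subscript position p."""
    reps = I * J * K * L // levels[p]
    return reps * sum((mean(((p, a),)) - grand) ** 2 for a in range(levels[p]))


def ss_two(p, q):
    """Two-factor interaction sum of squares for positions p and q."""
    reps = I * J * K * L // (levels[p] * levels[q])
    return reps * sum(
        (mean(((p, a), (q, b))) - mean(((p, a),)) - mean(((q, b),)) + grand) ** 2
        for a in range(levels[p]) for b in range(levels[q]))


def gamma_abc(i, j, k):
    """Estimate of the three-factor interaction parameter for cell (i, j, k)."""
    return (mean(((0, i), (1, j), (2, k)))
            - mean(((0, i), (1, j))) - mean(((0, i), (2, k))) - mean(((1, j), (2, k)))
            + mean(((0, i),)) + mean(((1, j),)) + mean(((2, k),)) - grand)


ss = {
    "A": ss_main(0), "B": ss_main(1), "C": ss_main(2),
    "A*B": ss_two(0, 1), "A*C": ss_two(0, 2), "B*C": ss_two(1, 2),
    "A*B*C": L * sum(gamma_abc(i, j, k) ** 2
                     for i, j, k in product(range(I), range(J), range(K))),
    "Error": sum((v - mean(((0, i), (1, j), (2, k)))) ** 2
                 for (i, j, k, _), v in x.items()),
}
sst = sum((v - grand) ** 2 for v in x.values())
df = {"A": I - 1, "B": J - 1, "C": K - 1,
      "A*B": (I - 1) * (J - 1), "A*C": (I - 1) * (K - 1), "B*C": (J - 1) * (K - 1),
      "A*B*C": (I - 1) * (J - 1) * (K - 1), "Error": I * J * K * (L - 1)}
mse = ss["Error"] / df["Error"]

for source in ss:
    ms = ss[source] / df[source]
    f_text = "" if source == "Error" else f"{ms / mse:8.2f}"
    print(f"{source:6s} {df[source]:3d} {ss[source]:9.3f} {ms:9.3f} {f_text}")
```

The P-values are not computed here because the standard library has no F distribution; a statistical package or F table would supply them from the printed f ratios and degrees of freedom.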

Table 11.9 ANOVA Table for the Purity Data of Example 11.10

Source    DF        SS         MS         F        P
A          2     215.385    107.692    112.07    0.000
B          2      74.506     37.253     38.77    0.000
C          2     602.717    301.358    313.60    0.000
A*B        4      13.452      3.363      3.50    0.020
A*C        4     107.409     26.852     27.94    0.000
B*C        4       4.374      1.094      1.14    0.360
A*B*C      8      12.472      1.559      1.62    0.165

Figure 11.8 shows two-factor interaction plots. For example, the dots in the
plot appearing in the C row and B column represent the $\bar{x}_{.jk.}$'s, that is, the
observations averaged over the levels of the first factor for each combination of levels of the
second and third factors. The bottom three dots connected by solid line segments
represent the third level of factor C at each level of factor B. The fact that connected
line segments are quite close to being parallel is evidence for the absence of BC
interactions, and indeed the P-value in Table 11.9 for testing this null hypothesis
is .360. However, the P-values for testing $H_{0AB}$ and $H_{0AC}$ are .020 and .000,
respectively. So at significance level .05, we are forced to conclude that AB interactions and
AC interactions are present. The line segments in the AC interaction plot are clearly
not close to being parallel. It appears from the interaction plots that expected purity
will be maximized when all factors are at their highest levels. As it happens, this is
also the message from the main effects plots, but those cannot generally be trusted
when interactions are present.


[Figure: "Interaction Plot for Purity" and "Main Effects Plot for Purity" (data means), with factors A, B, and C each at levels 1, 2, and 3]

Figure 11.8 Interaction and main effect plots from MINITAB for Example 11.10


Diagnostic plots for checking the normality and constant variance assumptions 
can be constructed as described in previous sections. Tukey’s procedure can be used 
in three-factor (or more) ANOVA. The second subscript on Q is the number of sam- 
ple means being compared, and the third is degrees of freedom for error. 

Models with random and mixed effects are also sometimes appropriate. Sums 
of squares and degrees of freedom are identical to the fixed effects case, but expected 
mean squares are, of course, different for the random main effects or interactions. A 
good reference is the book by Douglas Montgomery listed in the chapter bibliography. 


Latin Square Designs 


When several factors are to be studied simultaneously, an experiment in which there 
is at least one observation for every possible combination of levels is referred to as a 
complete layout. If the factors are A, B, and C with I, J, and K levels, respectively, a
complete layout requires at least IJK observations. Frequently an experiment of this size
is either impracticable because of cost, time, or space constraints or literally impossible. 
For example, if the response variable is sales of a certain product and the factors are dif- 
ferent display configurations, different stores, and different time periods, then only one 
display configuration can realistically be used in a given store during a given time period. 

A three-factor experiment in which fewer than IJK observations are made is
called an incomplete layout. There are some incomplete layouts in which the pattern
of combinations of factors is such that the analysis is straightforward. One such
three-factor design is called a Latin square. It is appropriate when I = J = K (e.g.,
four display configurations, four stores, and four time periods) and all two- and
three-factor interaction effects are assumed absent. If the levels of factor A are identified
with the rows of a two-way table and the levels of B with the columns of the table,
then the defining characteristic of a Latin square design is that every level of factor C
appears exactly once in each row and exactly once in each column. Figure 11.9 shows
examples of 3 × 3, 4 × 4, and 5 × 5 Latin squares. There are 12 different 3 × 3 Latin
squares, and the number of different Latin squares increases rapidly with the number
of levels (e.g., every permutation of rows of a given Latin square yields a Latin square,
and similarly for column permutations). It is recommended that the square used in
an actual experiment be chosen at random from the set of all possible squares of the
desired dimension; for further details, consult one of the chapter references.


              B                        B                           B
   C     1  2  3           C     1  2  3  4           C     1  2  3  4  5
      1  1  2  3              1  3  4  2  1              1  4  3  5  2  1
   A  2  2  3  1           A  2  4  2  1  3              2  3  1  4  5  2
      3  3  1  2              3  2  1  3  4           A  3  1  5  2  3  4
                              4  1  3  4  2              4  5  2  1  4  3
                                                         5  2  4  3  1  5


Figure 11.9 Examples of Latin squares
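
The defining property of a Latin square, and the count of 3 × 3 squares quoted above, are easy to check by brute force. The sketch below is only illustrative: the function names are made up for this example, and shuffling the rows and columns of one fixed square, while it always yields another Latin square, is not claimed to be a uniform draw from the set of all squares of that size.

```python
import random
from itertools import product


def is_latin_square(square):
    """True if each of the symbols 1..N appears exactly once in every row and column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = rows_ok and all({row[j] for row in square} == symbols for j in range(n))
    return rows_ok and cols_ok


def shuffled(square, rng):
    """Randomly permute the rows and then the columns of a Latin square."""
    n = len(square)
    row_order = rng.sample(range(n), n)
    col_order = rng.sample(range(n), n)
    return [[square[i][j] for j in col_order] for i in row_order]


# The 4 x 4 square of Figure 11.9: rows are levels of A, columns are levels
# of B, and each entry is the level of C used in that cell.
square4 = [[3, 4, 2, 1],
           [4, 2, 1, 3],
           [2, 1, 3, 4],
           [1, 3, 4, 2]]

# Brute-force count over all 3**9 ways of filling a 3 x 3 grid with 1, 2, 3.
count3 = sum(is_latin_square([list(g[0:3]), list(g[3:6]), list(g[6:9])])
             for g in product((1, 2, 3), repeat=9))

print(is_latin_square(square4))
print(count3)
print(shuffled(square4, random.Random(1)))
```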

The letter N will denote the common value of I, J, and K. Then a complete layout
with one observation per combination would require $N^3$ observations, whereas a
Latin square requires only $N^2$ observations. Once a particular square has been chosen,
the value of k (the level of factor C) is completely determined by the values of i and j.
To emphasize this, we use $x_{ij(k)}$ to denote the observed value when the three factors are
at levels i, j, and k, respectively, with k taking on only one value for each i, j pair.


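The model equation for a Latin square design is

$$X_{ij(k)} = \mu + \alpha_i + \beta_j + \delta_k + \epsilon_{ij(k)} \qquad i, j, k = 1, \ldots, N$$

where $\sum \alpha_i = \sum \beta_j = \sum \delta_k = 0$ and the $\epsilon_{ij(k)}$'s are independent and normally
distributed with mean 0 and variance $\sigma^2$.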


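We employ the following notation for totals and averages:

$$X_{i..} = \sum_j X_{ij(k)} \qquad X_{.j.} = \sum_i X_{ij(k)} \qquad X_{..k} = \sum_{i,j} X_{ij(k)} \qquad X_{...} = \sum_i \sum_j X_{ij(k)}$$

$$\bar{X}_{i..} = \frac{X_{i..}}{N} \qquad \bar{X}_{.j.} = \frac{X_{.j.}}{N} \qquad \bar{X}_{..k} = \frac{X_{..k}}{N} \qquad \bar{X}_{...} = \frac{X_{...}}{N^2}$$

Note that although $X_{i..}$ previously suggested a double summation, now it corresponds
to a single sum over all j (and the associated values of k).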
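DEFINITION Sums of squares for a Latin square experiment are

$$\begin{aligned}
SST &= \sum_i \sum_j (X_{ij(k)} - \bar{X}_{...})^2 & df &= N^2 - 1 \\
SSA &= \sum_i \sum_j (\bar{X}_{i..} - \bar{X}_{...})^2 & df &= N - 1 \\
SSB &= \sum_i \sum_j (\bar{X}_{.j.} - \bar{X}_{...})^2 & df &= N - 1 \\
SSC &= \sum_i \sum_j (\bar{X}_{..k} - \bar{X}_{...})^2 & df &= N - 1 \\
SSE &= \sum_i \sum_j [X_{ij(k)} - (\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\delta}_k)]^2 \\
    &= \sum_i \sum_j (X_{ij(k)} - \bar{X}_{i..} - \bar{X}_{.j.} - \bar{X}_{..k} + 2\bar{X}_{...})^2 & df &= (N - 1)(N - 2)
\end{aligned}$$

$$SST = SSA + SSB + SSC + SSE$$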


Each mean square is, of course, the ratio SS/df. For testing $H_{0C}$: $\delta_1 = \delta_2 = \cdots = \delta_N = 0$,
the test statistic value is $f_C = MSC/MSE$, and the P-value is the area under
the $F_{N-1,\,(N-1)(N-2)}$ curve to the right of $f_C$. The other two main effect null
hypotheses are tested analogously.
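
As a rough illustration of these definitions, the sketch below computes the Latin square sums of squares and F ratios directly from the averages. The 3 × 3 data set is hypothetical (it is built from made-up additive effects plus a small made-up disturbance, not taken from the text), and the function name `latin_square_anova` is an illustrative choice.

```python
def latin_square_anova(x, c):
    """Sums of squares and F ratios for an N x N Latin square.

    x[i][j] is the observation in row i and column j; c[i][j] is the
    (0-based) level of factor C assigned to that cell.
    """
    n = len(x)
    cells = [(i, j) for i in range(n) for j in range(n)]
    grand = sum(x[i][j] for i, j in cells) / n ** 2
    row = [sum(x[i]) / n for i in range(n)]
    col = [sum(x[i][j] for i in range(n)) / n for j in range(n)]
    trt = [sum(x[i][j] for i, j in cells if c[i][j] == k) / n for k in range(n)]
    ss = {
        "A": n * sum((m - grand) ** 2 for m in row),
        "B": n * sum((m - grand) ** 2 for m in col),
        "C": n * sum((m - grand) ** 2 for m in trt),
        "Error": sum((x[i][j] - row[i] - col[j] - trt[c[i][j]] + 2 * grand) ** 2
                     for i, j in cells),
        "Total": sum((x[i][j] - grand) ** 2 for i, j in cells),
    }
    mse = ss["Error"] / ((n - 1) * (n - 2))
    f = {name: (ss[name] / (n - 1)) / mse for name in "ABC"}
    return ss, f


# Hypothetical 3 x 3 illustration (not data from the text).
c = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]          # cyclic Latin square
a, b, d = [1, 2, 3], [10, 20, 30], [100, 200, 300]
noise = [[1, -1, 0], [0, 1, -1], [-1, 0, 1]]
x = [[a[i] + b[j] + d[c[i][j]] + noise[i][j] for j in range(3)] for i in range(3)]

ss, f = latin_square_anova(x, c)
for name, value in ss.items():
    print(f"{name:6s} {value:10.3f}")
print({name: round(value, 2) for name, value in f.items()})
```

Each f ratio would then be compared with the F distribution having N − 1 numerator and (N − 1)(N − 2) denominator df.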

If any of the null hypotheses is rejected, significant differences can be identified
by using Tukey’s procedure. After computing $w = Q_{\alpha,\,N,\,(N-1)(N-2)} \cdot \sqrt{MSE/N}$,
pairs of sample means (the $\bar{x}_{i..}$’s, $\bar{x}_{.j.}$’s, or $\bar{x}_{..k}$’s) differing by more than w correspond
to significant differences between associated factor effects (the $\alpha_i$’s, $\beta_j$’s, or $\delta_k$’s).

The hypothesis $H_{0C}$ is frequently the one of central interest. A Latin square
design is used to control for extraneous variation in the A and B factors, as was 
done by a randomized block design for the case of a single extraneous factor. Thus 
in the product sales example mentioned previously, variation due to both stores 
and time periods is controlled by a Latin square design, enabling an investigator 
to test for the presence of effects due to different product-display configurations. 


EXAMPLE 11.11 In an experiment to investigate the effect of relative humidity on abrasion resistance 
of leather cut from a rectangular pattern (“The Abrasion of Leather,” J. Inter. Soc.
Leather Trades’ Chemists, 1946: 287), a 6 X 6 Latin square was used to control for 
possible variability due to row and column position in the pattern. The six levels of 
relative humidity studied were 1 = 25%, 2 = 37%, 3 = 50%, 4 = 62%, 5 = 75%, 
and 6 = 87%, with the following results: 


B (columns) 


1 4 5 6 Xx; 

1 37,38 25.50 5.01 16.79 35.10 

2 27 15 45.78 36.24 65.06 37.35 

3 46.75 35.31 '7.81 78.05 39.90 

A (rows) 4 18.05 65.46 46.05 55.51 37.83 
5 65.65 36.54 7.03 45.96 37.89 

6 56.00 18.02 65.80 36.61 38.91 

x 40.98 36.61 37.94 37.98 


a 


Also, $x_{..1} = 46.10$, $x_{..2} = 40.59$, $x_{..3} = 39.56$, $x_{..4} = 35.86$, $x_{..5} = 32.23$, $x_{..6} = 32.64$,
and $x_{...} = 226.98$. Further computations are summarized in Table 11.10.


Table 11.10 ANOVA Table for Example 11.11


Source of Variation    df    Sum of Squares    Mean Square      f
A (rows)                5         2.19             .438        2.50
B (columns)             5         2.57             .514        2.94
C (treatments)          5        23.53            4.706       26.89
Error                  20         3.49             .175
Total                  35        31.78


Since $F_{.001,5,20} = 6.46 < 26.89$, P-value < .001. Thus $H_{0C}$ is rejected at any sensible
significance level in favor of the hypothesis that relative humidity does on average
affect abrasion resistance.

To apply Tukey’s procedure, $w = Q_{.05,6,20} \cdot \sqrt{MSE/6} = 4.45\sqrt{.175/6} = .76$.
Ordering the $\bar{x}_{..k}$’s and underscoring yields


  75%     87%     62%     50%     37%     25%
  5.37    5.44    5.98    6.59    6.77    7.68


In particular, the lowest relative humidity appears to result in a true average abrasion
resistance significantly higher than for any other relative humidity studied.
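
The Tukey comparison in this example can be reproduced from the treatment totals. This is a small sketch using the totals, MSE, and critical value Q = 4.45 quoted in the example; the dictionary labels and variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

N = 6
mse = 0.175      # error mean square from Table 11.10
q = 4.45         # Q_{.05,6,20} as quoted in the example

# Treatment totals x_{..k}, labeled by relative humidity level.
totals = {"25%": 46.10, "37%": 40.59, "50%": 39.56,
          "62%": 35.86, "75%": 32.23, "87%": 32.64}

w = q * sqrt(mse / N)
means = {level: total / N for level, total in totals.items()}

# Pairs of treatment means that differ by more than w.
different = [(a, b) for a, b in combinations(means, 2)
             if abs(means[a] - means[b]) > w]

print(f"w = {w:.2f}")
for level in sorted(means, key=means.get):
    print(f"{level:>4s} {means[level]:.2f}")
for a, b in different:
    print(f"{a} vs {b}: difference {abs(means[a] - means[b]):.2f}")
```

Any pair not listed would share an underscore in the ordered display of means.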
EXERCISES Section 11.3 (27-37)


27. The output of a continuous extruding machine that coats
steel pipe with plastic was studied as a function of the
thermostat temperature profile (A, at three levels), the
type of plastic (B, at three levels), and the speed of
the rotating screw that forces the plastic through a
tube-forming die (C, at three levels). There were two
replications (L = 2) at each combination of levels of the
factors, yielding a total of 54 observations on output. The
sums of squares were SSA = 14,144.44, SSB = 5511.27,
SSC = 244,696.39, SSAB = 1069.62, SSAC = 62.67,
SSBC = 331.67, SSE = 3127.50, and SST = 270,024.33.
a. Construct the ANOVA table.
b. Use appropriate F tests to show that none of the F
ratios for two- or three-factor interactions is significant
at level .05.
c. Which main effects appear significant?
d. With $x_{..1.} = 8242$, $x_{..2.} = 9732$, and $x_{..3.} = 11,210$,
use Tukey’s procedure to identify significant differences
among the levels of factor C.


28. To see whether thrust force in drilling is affected by
drilling speed (A), feed rate (B), or material used (C), an
experiment using four speeds, three rates, and two materials
was performed, with two samples (L = 2) drilled at
each combination of levels of the three factors. Sums of
squares were calculated as follows: SSA = 19,149.73,
SSB = 2,589,047.62, SSC = 157,437.52, SSAB = 53,238.21,
SSAC = 9033.73, SSBC = 91,880.04, SSE = 56,819.50,
and SST = 2,983,164.81. Construct the ANOVA
table and identify significant interactions using α = .01. Is
there any single factor that appears to have no effect on
thrust force? (In other words, does any factor appear
nonsignificant in every effect in which it appears?)


29. The article “Effects of Household Fabric Softeners
on Thermal Comfort of Cotton and Polyester 
Fabrics After Repeated Launderings” (Family and 
Consumer Science Research J., 2009: 535-549) 
reported the results of a three-factor ANOVA carried 
out to investigate the impact of fabric softener treat- 
ment (A: no softener, rinse-cycle softener, dryer-sheet 
softener), fabric type (B: 100% cotton, 100% polyes- 
ter), and number of laundering cycles (C: 1, 5, 25) on 
air permeability of fabric, which is an important deter- 
minant of thermal comfort. 

a. Five observations were made for each combination 
of factor levels. Resulting sums of squares were 
SSA = 1043.27, SSB = 112,148.10, SSC = 3020.97, 
SSAB = 373.52, SSAC = 392.71, SSBC = 145.95, 
SSABC = 54.13, and SSE = 339.30. Create an 
ANOVA table and carry out tests of all relevant 
hypotheses using a significance level of .01. 


b. Because the test for the presence of three-factor
interactions is insignificant, it makes sense to investigate
two-factor interactions. Use the following
values of various sample means to create interaction
plots, and comment as to whether they are consistent
with the test results of (a).


          A1        A2        A3
B1      67.10     56.50     65.93
B2     138.00    131.93    131.40
C1     110.25    105.55    103.30
C2     101.80     90.45     97.10
C3      95.60     86.65     95.60


(The cited article included a plot and commentary
based on the AC means.)


30. The following summary quantities were computed from
an experiment involving four levels of nitrogen (A), two
times of planting (B), and two levels of potassium (C)
(“Use and Misuse of Multiple Comparison Procedures,”
Agronomy J., 1977: 205-208). Only one observation
(N content, in percentage, of corn grain) was made
for each of the 16 combinations of levels.


SSA = .22625       SSB = .000025      SSC = .0036
SSAB = .004325     SSAC = .00065
SSBC = .000625     SST = .2384


a. Construct the ANOVA table.

b. Assume that there are no three-way interaction
effects, so that MSABC is a valid estimate of $\sigma^2$, and
test at level .05 for interaction and main effects.

c. The nitrogen averages are $\bar{x}_{1..} = 1.1200$, $\bar{x}_{2..} = 1.3025$,
$\bar{x}_{3..} = 1.3875$, and $\bar{x}_{4..} = 1.4300$. Use Tukey’s
method to examine differences in percentage N among
the nitrogen levels ($Q_{.05,4,3} = 6.82$).


31. Nickel titanium (NiTi) shape memory alloy (SMA) has
been widely used in medical devices. This is attributable 
largely to the alloy’s shape memory effect (material 
returns to its original shape after heat deformation), 
superelasticity, and biocompatibility. An alloy element is 
usually coated on the surface of NiTi SMAs to prevent 
toxic Ni release. 

The article “Parametrical Optimization of Laser
Surface Alloyed NiTi Shape Memory Alloy with Co 
and Nb by the Taguchi Method” (J. of Engr. Manuf., 
2012: 969-979) described an investigation to see whether 
the percent by weight of nickel in the alloyed layer is 
affected by carbon monoxide powder paste thickness 
(C, at three levels), scanning speed (B, at three levels), 
and laser power (A, at three levels). One observation was 
made at each factor-level combination [Note: Thickness 
column headings were incorrect in the cited article]: 

                     Paste Thickness
Power    Speed      .2       .3       .4

600       600     38.64    35.13    19.20
          900     38.16    34.24    26.23
         1200     37.54    33.46    30.44

700       600     36.56    35.91    34.62
          900     39.16    33.10    28.71
         1200     37.06    31.78    21.50

800       600     39.44    40.42    37.21
          900     39.34    37.64    35.65
         1200     39.30    34.97    32.50


a. Assuming the absence of three-factor interactions (as
did the investigators), SSE = SSABC can be used to
obtain an estimate of $\sigma^2$. Construct an ANOVA table
based on this data.

b. Use the appropriate F ratios to show that none of the
two-factor interactions is significant at α = .05.

c. Which main effects are significant at α = .05?

d. Use Tukey’s procedure with a simultaneous confidence
level of 95% to identify significant differences
between levels of paste thickness.


32. When factors A and B are fixed but factor C is random
and the restricted model is used (see the footnote on
page 438; there is a technical complication with the
unrestricted model here), $E(MSE) = \sigma^2$ and

$$\begin{aligned}
E(MSA) &= \sigma^2 + JL\sigma^2_{AC} + \frac{JKL}{I - 1} \sum_i \alpha_i^2 \\
E(MSB) &= \sigma^2 + IL\sigma^2_{BC} + \frac{IKL}{J - 1} \sum_j \beta_j^2 \\
E(MSC) &= \sigma^2 + IJL\sigma^2_C \\
E(MSAB) &= \sigma^2 + L\sigma^2_{ABC} + \frac{KL}{(I - 1)(J - 1)} \sum_i \sum_j (\gamma^{AB}_{ij})^2 \\
E(MSAC) &= \sigma^2 + JL\sigma^2_{AC} \\
E(MSBC) &= \sigma^2 + IL\sigma^2_{BC} \\
E(MSABC) &= \sigma^2 + L\sigma^2_{ABC}
\end{aligned}$$

a. Based on these expected mean squares, what F ratios
would you use to test $H_0$: $\sigma^2_{ABC} = 0$; $H_0$: $\sigma^2_C = 0$;
$H_0$: $\gamma^{AB}_{ij} = 0$ for all i, j; and $H_0$: $\alpha_1 = \cdots = \alpha_I = 0$?

b. In an experiment to assess the effects of age, type 
of soil, and day of production on compressive 
strength of cement/soil mixtures, two ages (A), four 
types of soil (B), and three days (C, assumed ran- 
dom) were used, with L = 2 observations made for 
each combination of factor levels. The resulting sums 
of squares were SSA = 14,318.24, SSB = 9656.40, 
SSC = 2270.22, SSAB = 3408.93, SSAC = 1442.58,
SSBC = 3096.21, SSABC = 2832.72, and SSE =
8655.60. Obtain the ANOVA table and carry out all
tests using level .01.


33. Because of potential variability in aging due to different
castings and segments on the castings, a Latin square
design with N = 7 was used to investigate the effect of heat
treatment on aging. With A = castings, B = segments,
C = heat treatments, summary statistics include
$x_{...} = 3815.8$, $\sum x_{i..}^2 = 297,216.90$, $\sum x_{.j.}^2 = 297,200.64$,
$\sum x_{..k}^2 = 297,155.01$, and $\sum\sum x_{ij(k)}^2 = 297,317.65$. Obtain
the ANOVA table and test at level .05 the hypothesis that
heat treatment has no effect on aging.


34. The article “The Responsiveness of Food Sales to
Shelf Space Requirements” (J. Marketing Research,
1964: 63-67) reports the use of a Latin square design to
investigate the effect of shelf space on food sales. The
experiment was carried out over a 6-week period using
six different stores, resulting in the following data on
sales of powdered coffee cream (with shelf space index
in parentheses):


                          Week
               1          2          3
         1   27 (5)     14 (4)     18 (3)
         2   34 (6)     31 (5)     34 (4)
         3   39 (2)     67 (6)     31 (5)
Store    4   40 (3)     57 (1)     39 (2)
         5   15 (4)     15 (3)     11 (1)
         6   16 (1)     15 (2)     14 (6)

                          Week
               4          5          6
         1   35 (1)     28 (6)     22 (2)
         2   46 (3)     37 (2)     23 (1)
         3   49 (4)     38 (1)     48 (3)
Store    4   70 (6)     37 (4)     50 (5)
         5    9 (2)     18 (5)     17 (6)
         6   12 (5)     19 (3)     22 (4)


Construct the ANOVA table, and state and test at level
.01 the hypothesis that shelf space does not affect sales
against the appropriate alternative.


35. The article “Variation in Moisture and Ascorbic Acid
Content from Leaf to Leaf and Plant to Plant in Turnip 
Greens” (Southern Cooperative Services Bull., 1951: 
13-17) uses a Latin square design in which factor A is 
plant, factor B is leaf size (smallest to largest), factor C (in 
parentheses) is time of weighing, and the response variable 
is moisture content. 

                         Leaf Size (B)
                  1            2            3
            1   6.67 (5)     7.15 (4)     8.29 (1)
            2   5.40 (2)     4.77 (5)     5.40 (4)
Plant (A)   3   7.32 (3)     8.53 (2)     8.50 (5)
            4   4.92 (1)     5.00 (3)     7.29 (2)
            5   4.88 (4)     6.16 (1)     7.83 (3)

                    Leaf Size (B)
                  4            5
            1   8.95 (3)     9.62 (2)
            2   7.54 (1)     6.93 (3)
Plant (A)   3   9.99 (4)     9.68 (1)
            4   7.85 (5)     7.08 (4)
            5   5.83 (2)     8.51 (5)

When all three factors are random, the expected mean
squares are $E(MSA) = \sigma^2 + N\sigma_A^2$, $E(MSB) = \sigma^2 + N\sigma_B^2$,
$E(MSC) = \sigma^2 + N\sigma_C^2$, and $E(MSE) = \sigma^2$. This implies
that the F ratios for testing $H_{0A}$: $\sigma_A^2 = 0$, $H_{0B}$: $\sigma_B^2 = 0$,
and $H_{0C}$: $\sigma_C^2 = 0$ are identical to those for fixed effects.
Obtain the ANOVA table and test at level .05 to see
whether there is any variation in moisture content due
to the factors.


36. The article “An Assessment of the Effects of Treatment,
Time, and Heat on the Removal of Erasable Pen Marks
from Cotton and Cotton/Polyester Blend Fabrics” (J. of
Testing and Eval., 1991: 394-397) reports the following
sums of squares for the response variable degree of
removal of marks: SSA = 39.171, SSB = .665, SSC =
21.508, SSAB = 1.432, SSAC = 15.953, SSBC = 1.382,
SSABC = 9.016, and SSE = 115.820. Four different
laundry treatments, three different types of pen, and six
different fabrics were used in the experiment, and there
were three observations for each treatment-pen-fabric
combination. Perform an analysis of variance using
α = .01 for each test, and state your conclusions (assume
fixed effects for all three factors).


37. A four-factor ANOVA experiment was carried out to
investigate the effects of fabric (A), type of exposure
(B), level of exposure (C), and fabric direction (D) on
extent of color change in exposed fabric as measured
by a spectrocolorimeter. Two observations were made
for each of the three fabrics, two types, three levels,
and two directions, resulting in MSA = 2207.329,
MSB = 47.255, MSC = 491.783, MSD = .044,
MSAB = 15.303, MSAC = 275.446, MSAD = .470,
MSBC = 2.141, MSBD = .273, MSCD = .247,
MSABC = 3.714, MSABD = 4.072,
MSACD = .767, MSBCD = .280, MSE = .977, and
MST = 93.621 (“Accelerated Weathering of Marine
Fabrics,” J. Testing and Eval., 1992: 139-143).
Assuming fixed effects for all factors, carry out an
analysis of variance using α = .01 for all tests and
summarize your conclusions.


11.4 $2^p$ Factorial Experiments


If an experimenter wishes to study simultaneously the effect of p different factors on a
response variable and the factors have $I_1, I_2, \ldots, I_p$ levels, respectively, then a complete
experiment requires at least $I_1 \cdot I_2 \cdots I_p$ observations. In such situations, the
experimenter can often perform a “screening experiment” with each factor at only two levels
to obtain preliminary information about factor effects. An experiment in which there
are p factors, each at two levels, is referred to as a $2^p$ factorial experiment.


$2^3$ Experiments


As in Section 11.3, we let $X_{ijkl}$ and $x_{ijkl}$ refer to the observation from the lth
replication, with factors A, B, and C at levels i, j, and k, respectively. The model for
this situation is

$$X_{ijkl} = \mu + \alpha_i + \beta_j + \delta_k + \gamma^{AB}_{ij} + \gamma^{AC}_{ik} + \gamma^{BC}_{jk} + \gamma_{ijk} + \epsilon_{ijkl} \qquad (11.14)$$

for i = 1, 2; j = 1, 2; k = 1, 2; l = 1, ..., n. The $\epsilon_{ijkl}$'s are assumed independent,
normally distributed, with mean 0 and variance $\sigma^2$. Because there are only two levels
of each factor, the side conditions on the parameters of (11.14) that uniquely specify
the model are simply stated: $\alpha_1 + \alpha_2 = 0, \ldots,$ $\gamma^{AB}_{11} + \gamma^{AB}_{21} = 0$, $\gamma^{AB}_{12} + \gamma^{AB}_{22} = 0$,
$\gamma^{AB}_{11} + \gamma^{AB}_{12} = 0$, $\gamma^{AB}_{21} + \gamma^{AB}_{22} = 0$, and the like. These conditions imply that there is only
one functionally independent parameter of each type (for each main effect and
interaction). For example, $\alpha_2 = -\alpha_1$, whereas $\gamma^{AB}_{21} = -\gamma^{AB}_{11}$, $\gamma^{AB}_{12} = -\gamma^{AB}_{11}$, and $\gamma^{AB}_{22} = \gamma^{AB}_{11}$.
Because of this, each sum of squares in the analysis will have 1 df.

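The parameters of the model can be estimated by taking averages over various
subscripts of the $X_{ijkl}$'s and then forming appropriate linear combinations of the
averages. For example,

$$\hat{\alpha}_1 = \bar{X}_{1...} - \bar{X}_{....} = \frac{X_{111.} + X_{121.} + X_{112.} + X_{122.} - X_{211.} - X_{221.} - X_{212.} - X_{222.}}{8n}$$

and

$$\hat{\gamma}^{AB}_{11} = \bar{X}_{11..} - \bar{X}_{1...} - \bar{X}_{.1..} + \bar{X}_{....} = \frac{X_{111.} - X_{121.} - X_{211.} + X_{221.} + X_{112.} - X_{122.} - X_{212.} + X_{222.}}{8n}$$

Each estimator is, except for the factor 1/(8n), a linear function of the cell totals
($X_{ijk.}$'s) in which each coefficient is +1 or −1, with an equal number of each; such
functions are called contrasts in the $X_{ijk.}$'s. Furthermore, the estimators satisfy the
same side conditions satisfied by the parameters themselves. For example,

$$\hat{\alpha}_1 + \hat{\alpha}_2 = \bar{X}_{1...} - \bar{X}_{....} + \bar{X}_{2...} - \bar{X}_{....} = \frac{1}{4n} X_{1...} + \frac{1}{4n} X_{2...} - \frac{2}{8n} X_{....} = \frac{1}{4n} X_{....} - \frac{1}{4n} X_{....} = 0$$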

EXAMPLE 11.12 In an experiment to investigate the compressive strength properties of cement-soil 
mixtures, two different aging periods were used in combination with two different 
temperatures and two different soils. Two replications were made for each combi- 
nation of levels of the three factors, resulting in the following data: 


Soil 
Age Temperature 1 2 
1 1 471, 413 385, 434 
2 485, 552 530, 593 
2 1 712, 637 770, 705 
2 712, 789 741, 806 


The computed cell totals are $x_{111.} = 884$, $x_{211.} = 1349$, $x_{121.} = 1037$, $x_{221.} = 1501$,
$x_{112.} = 819$, $x_{212.} = 1475$, $x_{122.} = 1123$, and $x_{222.} = 1547$, so $x_{....} = 9735$. Then

$$\hat{\alpha}_1 = (884 - 1349 + 1037 - 1501 + 819 - 1475 + 1123 - 1547)/16 = -125.5625 = -\hat{\alpha}_2$$

$$\hat{\gamma}^{AB}_{11} = (884 - 1349 - 1037 + 1501 + 819 - 1475 - 1123 + 1547)/16 = -14.5625 = -\hat{\gamma}^{AB}_{12} = -\hat{\gamma}^{AB}_{21} = \hat{\gamma}^{AB}_{22}$$

The other parameter estimates can be computed in the same manner.
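
The same estimates can be obtained by averaging the raw observations over the appropriate subscripts, as in the defining formulas. A minimal sketch, with the observations of this example keyed by (age, temperature, soil) level and an illustrative helper `mean_where`:

```python
# Replicate observations, keyed by (age i, temperature j, soil k).
obs = {
    (1, 1, 1): [471, 413], (1, 1, 2): [385, 434],
    (1, 2, 1): [485, 552], (1, 2, 2): [530, 593],
    (2, 1, 1): [712, 637], (2, 1, 2): [770, 705],
    (2, 2, 1): [712, 789], (2, 2, 2): [741, 806],
}

cell_total = {cell: sum(values) for cell, values in obs.items()}
grand_total = sum(cell_total.values())


def mean_where(keep):
    """Average of all observations in the cells for which keep(cell) is true."""
    values = [y for cell, ys in obs.items() if keep(cell) for y in ys]
    return sum(values) / len(values)


grand = mean_where(lambda cell: True)
alpha_1 = mean_where(lambda cell: cell[0] == 1) - grand
gamma_ab_11 = (mean_where(lambda cell: cell[0] == 1 and cell[1] == 1)
               - mean_where(lambda cell: cell[0] == 1)
               - mean_where(lambda cell: cell[1] == 1)
               + grand)

print(cell_total)
print(grand_total, alpha_1, gamma_ab_11)
```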


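Analysis of a $2^3$ Experiment Sums of squares for the various effects are easily
obtained from the parameter estimates. For example,

$$SSA = \sum_i \sum_j \sum_k \sum_l \hat{\alpha}_i^2 = 4n \sum_{i=1}^{2} \hat{\alpha}_i^2 = 4n[\hat{\alpha}_1^2 + (-\hat{\alpha}_1)^2] = 8n\hat{\alpha}_1^2$$

and

$$SSAB = \sum_i \sum_j \sum_k \sum_l (\hat{\gamma}^{AB}_{ij})^2 = 2n \sum_{i=1}^{2} \sum_{j=1}^{2} (\hat{\gamma}^{AB}_{ij})^2 = 2n[(\hat{\gamma}^{AB}_{11})^2 + (-\hat{\gamma}^{AB}_{11})^2 + (-\hat{\gamma}^{AB}_{11})^2 + (\hat{\gamma}^{AB}_{11})^2] = 8n(\hat{\gamma}^{AB}_{11})^2$$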


Since each estimate is a contrast in the cell totals multiplied by 1/(8n),
each sum of squares has the form (contrast)²/(8n). Thus to compute the various
sums of squares, we need to know the coefficients (+1 or −1) of the appropriate
contrasts. The signs (+ or −) on each $x_{ijk.}$ in each effect contrast are most
conveniently displayed in a table. We will use the notation (1) for the experimental
condition i = 1, j = 1, k = 1, a for i = 2, j = 1, k = 1, ab for i = 2, j = 2, k = 1,
and so on. If level 1 is thought of as “low” and level 2 as “high,” any letter that
appears denotes a high level of the associated factor. Each column in Table 11.11
gives the signs for a particular effect contrast in the $x_{ijk.}$'s associated with the
different experimental conditions.


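Table 11.11 Signs for Computing Effect Contrasts


Experimental    Cell                      Factorial Effect
Condition       Total          A    B    C    AB   AC   BC   ABC
(1)             $x_{111.}$     -    -    -    +    +    +    -
a               $x_{211.}$     +    -    -    -    -    +    +
b               $x_{121.}$     -    +    -    -    +    -    +
ab              $x_{221.}$     +    +    -    +    -    -    -
c               $x_{112.}$     -    -    +    +    -    -    +
ac              $x_{212.}$     +    -    +    -    +    -    -
bc              $x_{122.}$     -    +    +    -    -    +    -
abc             $x_{222.}$     +    +    +    +    +    +    +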


In each of the first three columns, the sign is + if the corresponding factor is 
at the high level and — if it is at the low level. Every sign in the AB column is then 
the “product” of the signs in the A and B columns, with (+)(+) = (—)(—) = + and 
(+)(—) = (—)(+) = —, and similarly for the AC and BC columns. Finally, the signs 
in the ABC column are the products of AB with C (or B with AC or A with BC). Thus, 
for example, 


AC contrast $= +x_{111.} - x_{211.} + x_{121.} - x_{221.} - x_{112.} + x_{212.} - x_{122.} + x_{222.}$


Once the seven effect contrasts are computed, 


$$SS(\text{effect}) = \frac{(\text{effect contrast})^2}{8n}$$
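
The sign rule described above lends itself to direct computation. The sketch below builds each sign as a product of +1 (letter present in the condition label) and −1 (letter absent), then forms the contrasts and sums of squares for the cell totals of Example 11.12; the function name `sign` and the variable names are illustrative.

```python
conditions = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]
cell_totals = [884, 1349, 1037, 1501, 819, 1475, 1123, 1547]   # Example 11.12
n = 2                                                           # replications per cell
effects = ["A", "B", "C", "AB", "AC", "BC", "ABC"]


def sign(effect, condition):
    """Product of +1 (factor at high level) or -1 (low level) over the factors in `effect`."""
    s = 1
    for factor in effect.lower():
        s *= 1 if factor in condition else -1
    return s


contrast = {e: sum(sign(e, c) * t for c, t in zip(conditions, cell_totals))
            for e in effects}
ss = {e: contrast[e] ** 2 / (8 * n) for e in effects}

for e in effects:
    signs = " ".join("+" if sign(e, c) > 0 else "-" for c in conditions)
    print(f"{e:4s} {signs}   contrast = {contrast[e]:6d}   SS = {ss[e]:12.4f}")
```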


Software for doing the calculations required to analyze data from factorial
experiments is widely available (e.g., Minitab). Alternatively, here is an efficient method for
hand computation due to Yates. Write in a column the eight cell totals in the standard
order, as given in the table of signs, and establish three additional columns. In each
of these three columns, the first four entries are the sums of entries 1 and 2, 3 and 4,
5 and 6, and 7 and 8 of the previous column. The last four entries are the differences
between entries 2 and 1, 4 and 3, 6 and 5, and 8 and 7 of the previous column. The last
column then contains $x_{....}$ and the seven effect contrasts in standard order. Squaring
each contrast and dividing by 8n then gives the seven sums of squares.
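
Yates's column-by-column procedure translates almost line for line into code. This is a sketch of the method as described above, written so that it also works for any list of $2^p$ totals in standard order; the function name `yates` is an illustrative choice.

```python
def yates(totals):
    """Apply Yates's method to 2**p totals listed in standard order.

    Returns the final column: the grand total followed by the effect
    contrasts in standard order (A, B, AB, C, AC, BC, ABC, ...).
    """
    column = list(totals)
    p = len(column).bit_length() - 1
    if len(column) != 2 ** p:
        raise ValueError("number of totals must be a power of 2")
    for _ in range(p):
        pairs = [(column[i], column[i + 1]) for i in range(0, len(column), 2)]
        # Top half: sums of successive pairs; bottom half: second minus first.
        column = [a + b for a, b in pairs] + [b - a for a, b in pairs]
    return column


# Cell totals of Example 11.12 in standard order: (1), a, b, ab, c, ac, bc, abc.
final = yates([884, 1349, 1037, 1501, 819, 1475, 1123, 1547])
print(final)
print([c ** 2 / 16 for c in final[1:]])   # sums of squares with 8n = 16
```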


EXAMPLE 11.13 (Example 11.12 continued) Since n = 2, 8n = 16. Yates’s method is illustrated in Table 11.12.

Table 11.12 Yates’s Method of Computation


Treatment                                            Effect
Condition            $x_{ijk.}$      1       2       Contrast    SS = (contrast)²/16
(1) = $x_{111.}$        884        2233    4771       9735
a = $x_{211.}$         1349        2538    4964       2009          252,255.06
b = $x_{121.}$         1037        2294     929        681           28,985.06
ab = $x_{221.}$        1501        2670    1080       -233            3,393.06
c = $x_{112.}$          819         465     305        193            2,328.06
ac = $x_{212.}$        1475         464     376        151            1,425.06
bc = $x_{122.}$        1123         656      -1         71              315.06
abc = $x_{222.}$       1547         424    -232       -231            3,335.06
                                                                    292,036.42


From the original data, $\sum_i \sum_j \sum_k \sum_l x_{ijkl}^2 = 6,232,289$, and

$$\frac{x_{....}^2}{16} = 5,923,139.06$$

so

SST = 6,232,289 − 5,923,139.06 = 309,149.94
SSE = SST − [SSA + ··· + SSABC] = 309,149.94 − 292,036.42 = 17,113.52

The ANOVA calculations are summarized in Table 11.13.


Table 11.13 ANOVA Table for Example 11.13


Source of
Variation    df    Sum of Squares    Mean Square       f
A             1      252,255.06       252,255.06     117.92
B             1       28,985.06        28,985.06      13.55
C             1        2,328.06         2,328.06       1.09
AB            1        3,393.06         3,393.06       1.59
AC            1        1,425.06         1,425.06        .67
BC            1          315.06           315.06        .15
ABC           1        3,335.06         3,335.06       1.56
Error         8       17,113.52         2,139.19
Total        15      309,149.94

Figure 11.10 shows SAS output for this example. Only the P-values for age 
(A) and temperature (B) are less than .01, so only these effects are judged significant. 

Analysis of Variance Procedure

Dependent Variable: STRENGTH
                                  Sum of            Mean
Source              DF           Squares          Square    F Value    Pr > F
Model                7       292036.4375      41719.4911      19.50    0.0002
Error                8        17113.5000       2139.1875
Corrected Total     15       309149.9375

R-Square          C.V.        Root MSE     POWERUSE Mean
0.944643      7.601660        46.25135        608.437500

Source              DF          Anova SS     Mean Square    F Value    Pr > F
AGE                  1       252255.0625     252255.0625     117.92    0.0001
TEMP                 1        28985.0625      28985.0625      13.55    0.0062
AGE*TEMP             1         3393.0625       3393.0625       1.59    0.2434
SOIL                 1         2328.0625       2328.0625       1.09    0.3273
AGE*SOIL             1         1425.0625       1425.0625       0.67    0.4380
TEMP*SOIL            1          315.0625        315.0625       0.15    0.7111
AGE*TEMP*SOIL        1         3335.0625       3335.0625       1.56    0.2471

Figure 11.10 SAS output for strength data of Example 11.13
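
The error sum of squares and f ratios in Table 11.13 can also be recovered directly from the raw observations. A minimal sketch, assuming the data of Example 11.12 keyed by experimental condition (a = age, b = temperature, c = soil at the high level); the helper `sign` repeats the sign rule of Table 11.11, and P-values are left to software or tables.

```python
# Raw observations by experimental condition.
data = {
    "(1)": [471, 413], "a": [712, 637], "b": [485, 552], "ab": [712, 789],
    "c": [385, 434], "ac": [770, 705], "bc": [530, 593], "abc": [741, 806],
}
n = 2
effects = ["A", "B", "C", "AB", "AC", "BC", "ABC"]


def sign(effect, condition):
    """+1 or -1 coefficient of a cell total in the contrast for `effect`."""
    s = 1
    for factor in effect.lower():
        s *= 1 if factor in condition else -1
    return s


total = {cond: sum(ys) for cond, ys in data.items()}
ss = {e: sum(sign(e, cond) * t for cond, t in total.items()) ** 2 / (8 * n)
      for e in effects}

grand_total = sum(total.values())
sst = sum(y * y for ys in data.values() for y in ys) - grand_total ** 2 / (8 * n)
sse = sst - sum(ss.values())
df_error = 8 * (n - 1)
mse = sse / df_error

# Each effect has 1 df, so its mean square equals its sum of squares.
f = {e: ss[e] / mse for e in effects}

for e in effects:
    print(f"{e:5s} 1 {ss[e]:12.2f} {f[e]:8.2f}")
print(f"Error {df_error} {sse:12.2f} {mse:10.2f}")
print(f"Total {8 * n - 1} {sst:11.2f}")
```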


$2^p$ Experiments for p > 3


The analysis of data from a $2^p$ experiment with p > 3 parallels that of the three-factor
case. For example, if there are four factors A, B, C, and D, there are 16 different 
experimental conditions. The first 8 in standard order are exactly those already listed 
for a three-factor experiment. The second 8 are obtained by placing the letter d beside 
each condition in the first group. Yates’s method is then initiated by computing totals 
across replications, listing these totals in standard order, and proceeding as before; 
with p factors, the pth column to the right of the treatment totals will give the effect 
contrasts. 

For p > 3, there will often be no replications of the experiment (so only one 
complete replicate is available). One possible way to test hypotheses is to assume 
that certain higher-order effects are absent and then add the corresponding sums 
of squares to obtain an SSE. Such an assumption can, however, be misleading 
in the absence of prior knowledge (see the book by Montgomery listed in the 
chapter bibliography). An alternative approach involves working directly with 
the effect contrasts. Each contrast has a normal distribution with the same vari- 
ance. When a particular effect is absent, the expected value of the corresponding 
contrast is 0, but this is not so when the effect is present. The suggested method 
of analysis is to construct a normal probability plot of the effect contrasts (or,
equivalently, the effect parameter estimates, since estimate = contrast/$2^p$ when
n = 1). Points corresponding to absent effects will tend to fall close to a straight
line, whereas points associated with substantial effects will typically be far from 
this line. 


EXAMPLE 11.14 The accompanying data is from the article “Quick and Easy Analysis of
Unreplicated Factorials” (Technometrics, 1989: 469-473). The four factors are
A = acid strength, B = time, C = amount of acid, and D = temperature, and the
response variable is the yield of isatin. The observations, in standard order, are .08,
.04, .53, .43, .31, .09, .12, .36, .79, .68, .73, .08, .77, .38, .49, and .23. Table 11.14
displays the effect estimates as given in the article (which uses contrast/8 rather than
contrast/16).




Table 11.14 Effect Estimates for Example 11.14 


Effect      A       B       AB      C       AC      BC      ABC     D 
Estimate   −.191   −.021   −.001   −.076    .034   −.066    .149    .274 

Effect      AD      BD      ABD     CD      ACD     BCD     ABCD 
Estimate   −.161   −.251   −.101   −.026   −.006    .124    .019 


Figure 11.11 is a normal probability plot of the effect estimates. All points in the plot 
fall close to the same straight line, suggesting the complete absence of any effects 
(we will shortly give an example in which this is not the case). 


[Plot: effect estimate (vertical axis) against z percentile (horizontal axis)] 


Figure 11.11 A normal probability plot of effect estimates from Example 11.14 
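
As a rough check on this example, the effect contrasts can be recomputed from the sixteen observations with Yates's method and then paired with normal scores for plotting. The sketch below is illustrative only: the function name `yates` is hypothetical, and the plotting positions (i − .5)/n are one common convention that we assume here rather than one specified in the text.

```python
from statistics import NormalDist

def yates(totals):
    """Apply Yates's method to totals listed in standard order.

    Returns the final column: the grand total followed by the effect
    contrasts in standard order (A, B, AB, C, ...).
    """
    col = list(totals)
    p = len(col).bit_length() - 1          # number of factors
    assert len(col) == 2 ** p, "need 2^p totals"
    for _ in range(p):
        pairs = [(col[i], col[i + 1]) for i in range(0, len(col), 2)]
        col = [x + y for x, y in pairs] + [y - x for x, y in pairs]
    return col

# Observations from Example 11.14, in standard order
y = [.08, .04, .53, .43, .31, .09, .12, .36,
     .79, .68, .73, .08, .77, .38, .49, .23]
labels = ["A", "B", "AB", "C", "AC", "BC", "ABC", "D",
          "AD", "BD", "ABD", "CD", "ACD", "BCD", "ABCD"]

contrasts = yates(y)[1:]
estimates = {lab: c / 8 for lab, c in zip(labels, contrasts)}   # the article's contrast/8 scaling

# Pair the ordered estimates with normal scores (assumed plotting positions).
n = len(labels)
ordered = sorted(estimates.items(), key=lambda kv: kv[1])
for i, (lab, est) in enumerate(ordered, start=1):
    z = NormalDist().inv_cdf((i - 0.5) / n)
    print(f"{lab:5s} {est:7.3f} {z:6.2f}")
```

Plotting the second printed column against the third gives a normal probability plot of the kind described above.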


Visual judgments of deviation from straightness in a normal probability plot 
are rather subjective. The article cited in Example 11.14 describes a more objective 
technique for identifying significant effects in an unreplicated experiment. 


Confounding 


It is often not possible to carry out all 2^p experimental conditions of a 2^p factorial 
experiment in a homogeneous experimental environment. In such situations, it may 
be possible to separate the experimental conditions into 2^r homogeneous blocks 
(r < p), so that there are 2^(p−r) experimental conditions in each block. The blocks 
may, for example, correspond to different laboratories, different time periods, or 
different operators or work crews. In the simplest case, p = 3 and r = 1, so that 
there are two blocks, with each block consisting of four of the eight experimental 
conditions. 

As always, blocking is effective in reducing variation associated with extraneous 
sources. However, when the 2^p experimental conditions are placed in 2^r blocks, 
the price paid for this blocking is that 2^r − 1 of the factor effects cannot be 
estimated. This is because 2^r − 1 factor effects (main effects and/or interactions) are 
mixed up, or confounded, with the block effects. The allocation of experimental 
conditions to blocks is then usually done so that only higher-level interactions are 
confounded, whereas main effects and low-order interactions remain estimable and 
hypotheses can be tested. 

To see how allocation to blocks is accomplished, consider first a 2^3 experiment 
with two blocks (r = 1) and four treatments per block. Suppose we select ABC as the 
effect to be confounded with blocks. Then any experimental condition having an odd 
number of letters in common with ABC, such as b (one letter) or abc (three letters), 
is placed in one block, whereas any condition having an even number of letters in 
common with ABC (where 0 is even) goes in the other block. Figure 11.12 shows 
this allocation of treatments to the two blocks. 


Block 1              Block 2 


(1), ab, ac, bc      a, b, c, abc 


Figure 11.12 Confounding ABC in a 2^3 experiment 
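
The odd/even rule is easy to mechanize. The following sketch (with hypothetical helper names `common_letters` and `two_blocks`) counts the letters each condition shares with the confounded effect and splits the conditions accordingly.

```python
def common_letters(condition, effect):
    """Number of letters a treatment condition shares with an effect."""
    return len(set(condition) & set(effect.lower()))

def two_blocks(conditions, effect):
    """Split conditions into (even, odd) blocks by letters shared with the effect."""
    even = [c for c in conditions if common_letters(c, effect) % 2 == 0]
    odd = [c for c in conditions if common_letters(c, effect) % 2 == 1]
    return even, odd

conditions = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]
block1, block2 = two_blocks(conditions, "ABC")
print(block1)   # even number of letters in common with ABC
print(block2)   # odd number of letters in common with ABC
```

The condition (1) shares no letters with any effect, so it always lands in the "even" block.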


In the absence of replications, the data from such an experiment would usually 
be analyzed by assuming that there were no two-factor interactions (additivity) and 
using SSE = SSAB + SSAC + SSBC with 3 df to test for the presence of main 
effects. Alternatively, a normal probability plot of effect contrasts or effect parameter 
estimates could be examined. Most frequently, though, there are replications when 
just three factors are being studied. Suppose there are u replicates, resulting in a 
total of 2^r · u blocks in the experiment. Then after subtracting from SST all sums of 
squares associated with effects not confounded with blocks (computed using Yates’s 
method), the block sum of squares is computed using the 2^r · u block totals and then 
subtracted to yield SSE (so there are 2^r · u − 1 df for blocks). 


EXAMPLE 11.15 The article “Factorial Experiments in Pilot Plant Studies” (Industrial and Eng. 
Chemistry, 1951: 1300-1306) reports the results of an experiment to assess the effects 
of reactor temperature (A), gas throughput (B), and concentration of active constituent 
(C) on the strength of the product solution (measured in arbitrary units) in a recirculation 
unit. Two blocks were used, with the ABC effect confounded with blocks, and there were 
two replications, resulting in the data in Figure 11.13. The four block × replication totals 
are 288, 212, 88, and 220, with a grand total of 808, so 


SSBl = [(288)^2 + (212)^2 + (88)^2 + (220)^2]/4 − (808)^2/16 = 5204.00 


[Figure: observations arranged by Replication 1 (Block 1, Block 2) and Replication 2 (Block 1, Block 2)] 


Figure 11.13 Data for Example 11.15 


The other sums of squares are computed by Yates’s method using the eight experimental 
condition totals, resulting in the ANOVA table given as Table 11.15. By comparison 
with F_{.05,1,6} = 5.99, we conclude that only the main effects for A and C differ 
significantly from zero (P-value < .05 for just f_A and f_C). 

Table 11.15 ANOVA Table for Example 11.15 


Source of 
Variation    df    Sum of Squares    Mean Square    f 
A             1      12,996           12,996        39.82 
B             1         702.25           702.25      2.15 
C             1       2,756.25         2,756.25      8.45 
AB            1         210.25           210.25       .64 
AC            1          30.25            30.25       .093 
BC            1          25               25          .077 
Blocks        3       5,204            1,734.67      5.32 
Error         6       1,958              326.33 
Total        15      23,882 
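
The arithmetic in this example can be retraced with a short script. The sketch below takes the block totals and the effect sums of squares as reported in the example, recovers SSE by subtraction, and forms the F ratios; the variable names are ours and purely illustrative.

```python
block_totals = [288, 212, 88, 220]     # four block-by-replication totals
per_block = 4                          # observations in each block
n_total = per_block * len(block_totals)

grand = sum(block_totals)
ss_blocks = sum(t ** 2 for t in block_totals) / per_block - grand ** 2 / n_total

# Effect sums of squares and SST as reported in Table 11.15
ss = {"A": 12996, "B": 702.25, "C": 2756.25, "AB": 210.25, "AC": 30.25, "BC": 25}
sst = 23882
sse = sst - sum(ss.values()) - ss_blocks
mse = sse / 6                          # 15 - 6 - 3 = 6 df for error

f_ratios = {name: value / mse for name, value in ss.items()}
print(ss_blocks, sse, round(mse, 2))
print({name: round(f, 2) for name, f in f_ratios.items()})
```

Only the ratios for A and C exceed the critical value 5.99 used in the example.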


Confounding Using More than Two Blocks 


In the case r = 2 (four blocks), three effects are confounded with blocks. The 
experimenter first chooses two defining effects to be confounded. For example, in 
a five-factor experiment (A, B, C, D, and E), the two three-factor interactions BCD 
and CDE might be chosen for confounding. The third effect confounded is then 
the generalized interaction of the two, obtained by writing the two chosen effects 
side by side and then cancelling any letters common to both: (BCD)(CDE) = BE. 
Notice that if ABC and CDE are chosen for confounding, their generalized interac- 
tion is (ABC)(CDE) = ABDE, so that no main effects or two-factor interactions are 
confounded. 
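
Cancelling common letters is the same as taking a symmetric difference of letter sets, which suggests the following minimal sketch (the function name `generalized_interaction` is a hypothetical helper).

```python
def generalized_interaction(*effects):
    """Multiply effects, cancelling any letter that appears an even number of times."""
    letters = set()
    for effect in effects:
        letters ^= set(effect)           # symmetric difference cancels shared letters
    return "".join(sorted(letters))

print(generalized_interaction("BCD", "CDE"))   # BE
print(generalized_interaction("ABC", "CDE"))   # ABDE
```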

Once the two defining effects have been selected for confounding, one block 
consists of all treatment conditions having an even number of letters in common with 
both defining effects. The second block consists of all conditions having an even 
number of letters in common with the first defining contrast and an odd number of 
letters in common with the second contrast, and the third and fourth blocks consist 
of the “odd/even” and “odd/odd” contrasts. In a five-factor experiment with defining 
effects ABC and CDE, this results in the allocation to blocks as shown in Figure 11.14 
(with the number of letters in common with each defining contrast appearing beside 
each experimental condition). 


Block 1           Block 2           Block 3           Block 4 
(1)   (0, 0)      d     (0, 1)      a     (1, 0)      c      (1, 1) 
ab    (2, 0)      e     (0, 1)      b     (1, 0)      ad     (1, 1) 
de    (0, 2)      ac    (2, 1)      cd    (1, 2)      ae     (1, 1) 
acd   (2, 2)      bc    (2, 1)      ce    (1, 2)      bd     (1, 1) 
ace   (2, 2)      abd   (2, 1)      ade   (1, 2)      be     (1, 1) 
bcd   (2, 2)      abe   (2, 1)      bde   (1, 2)      abc    (3, 1) 
bce   (2, 2)      acde  (2, 3)      abcd  (3, 2)      cde    (1, 3) 
abde  (2, 2)      bcde  (2, 3)      abce  (3, 2)      abcde  (3, 3) 


Figure 11.14 Four blocks in a 2^5 factorial experiment with defining effects ABC and CDE 
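
With two defining effects, each condition is classified by a pair of parities. A minimal sketch of that classification follows; the helper names `all_conditions` and `parity_blocks` are assumptions made for illustration.

```python
from itertools import combinations

def all_conditions(factors):
    """Every treatment condition of a 2^p experiment, with '' standing for (1)."""
    letters = factors.lower()
    return ["".join(c) for k in range(len(letters) + 1)
            for c in combinations(letters, k)]

def parity_blocks(factors, defining_effects):
    """Group conditions by the parity (0 even, 1 odd) of letters shared with each defining effect."""
    blocks = {}
    for cond in all_conditions(factors):
        key = tuple(len(set(cond) & set(e.lower())) % 2 for e in defining_effects)
        blocks.setdefault(key, []).append(cond or "(1)")
    return blocks

blocks = parity_blocks("ABCDE", ["ABC", "CDE"])
for key in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(key, blocks[key])
```

The keys (0, 0), (0, 1), (1, 0), and (1, 1) correspond to the even/even, even/odd, odd/even, and odd/odd blocks, respectively.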


The block containing (1) is called the principal block. Once it has been 
constructed, a second block can be obtained by selecting any experimental condition not 
in the principal block and obtaining its generalized interaction with every condition 
in the principal block. The other blocks are then constructed in the same way by 
first selecting a condition not in a block already constructed and finding generalized 
interactions with the principal block. 
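
The construction from the principal block can be sketched as follows for the five-factor layout with defining effects ABC and CDE. The names `product` and `block_from` are hypothetical, and the empty string stands for the condition (1).

```python
def product(cond1, cond2):
    """Generalized interaction of two conditions: cancel shared letters."""
    return "".join(sorted(set(cond1) ^ set(cond2)))

# Principal block for defining effects ABC and CDE, with "" standing for (1)
principal = ["", "ab", "de", "acd", "ace", "bcd", "bce", "abde"]

def block_from(seed):
    """Block obtained by combining a seed condition with every principal-block condition."""
    return sorted(product(seed, c) for c in principal)

print(block_from("d"))   # a second block
print(block_from("a"))   # a third block
print(block_from("c"))   # the remaining block
```

Any other condition in the same block serves equally well as the seed; starting from e instead of d reproduces the same eight conditions.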

For experimental situations with p > 3, there is often no replication, so sums of 
squares associated with nonconfounded higher-order interactions are usually pooled to 
obtain an error sum of squares that can be used in the denominators of the various F 
statistics. All computations can again be carried out using Yates’s technique, with SSB 
being the sum of sums of squares associated with confounded effects. 

When r > 2, one first selects r defining effects to be confounded with blocks, 
making sure that no one of the effects chosen is the generalized interaction of any 
other two selected. The additional 2^r − r − 1 effects confounded with the blocks are 
then the generalized interactions of all effects in the defining set (including not only 
generalized interactions of pairs of effects but also of sets of three, four, and so on). 
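
To illustrate, the full set of 2^r − 1 confounded effects can be generated from the defining set by forming generalized interactions over every nonempty subset. In the sketch below, the function name and the particular choice of three defining effects in a six-factor experiment are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def confounded_effects(defining):
    """All 2^r - 1 effects confounded with blocks, given r defining effects."""
    effects = []
    for k in range(1, len(defining) + 1):
        for subset in combinations(defining, k):
            letters = set()
            for effect in subset:
                letters ^= set(effect)      # cancel letters common to an even number of effects
            effects.append("".join(sorted(letters)))
    return effects

# Hypothetical choice of r = 3 defining effects in a six-factor experiment
print(confounded_effects(["ABC", "CDE", "AEF"]))
```

With r = 3 the list contains the three defining effects plus 2^3 − 3 − 1 = 4 generalized interactions.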


Fractional Replication 


When the number of factors p is large, even a single replicate of a 2^p experiment can 
be expensive and time consuming. For example, one replicate of a 2^6 factorial experiment 
involves an observation for each of the 64 different experimental conditions. 
An appealing strategy in such situations is to make observations for only a fraction 
of the 2^p conditions. Provided that care is exercised in the choice of conditions to be 
observed, much information about factor effects can still be obtained. 

Suppose we decide to include only 2^(p−1) (half) of the 2^p possible conditions in 
our experiment; this is usually called a half-replicate. The price paid for this 
economy is twofold. First, information about a single effect (determined by the 2^(p−1) 
conditions selected for observation) is completely lost to the experimenter in the 
sense that no reasonable estimate of the effect is possible. Second, the remaining 
2^p − 2 main effects and interactions are paired up so that any one effect in a particular 
pair is confounded with the other effect in the same pair. For example, one such pair 
may be {A, BCD}, so that separate estimates of the A main effect and BCD interaction 
are not possible. It is desirable, then, to select a half-replicate for which main 
effects and low-order interactions are paired off (confounded) only with higher-order 
interactions rather than with one another. 

The first step in specifying a half-replicate is to select a defining effect as the 
nonestimable effect. Suppose that in a five-factor experiment, ABCDE is chosen as 
the defining effect. Now the 2^5 = 32 possible treatment conditions are divided into 
two groups with 16 conditions each, one group consisting of all conditions having 
an odd number of letters in common with ABCDE and the other containing an even 
number of letters in common with the defining contrast. Then either group of 16 
conditions is used as the half-replicate. The “odd” group is 


a, b, c, d, e, abc, abd, abe, acd, ace, ade, bcd, bce, bde, cde, abcde 


Each main effect and interaction other than ABCDE is then confounded with 
(aliased with) its generalized interaction with ABCDE. Thus (AB)(ABCDE) = CDE, 
so the AB interaction and CDE interaction are confounded with each other. The 
resulting alias pairs are 


{A, BCDE} {B, ACDE} {C, ABDE} {D, ABCE} {E, ABCD} 

{AB, CDE} {AC, BDE} {AD, BCE} {AE, BCD} {BC, ADE} 

{BD, ACE} {BE, ACD} {CD, ABE} {CE, ABD} {DE, ABC} 
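
The "odd" half-replicate and the alias pairs listed above can be generated mechanically. This is a minimal sketch under the same setup (five factors, defining effect ABCDE); the helper name `alias` is hypothetical.

```python
from itertools import combinations

def alias(effect, defining):
    """Generalized interaction of an effect with the defining effect."""
    return "".join(sorted(set(effect) ^ set(defining)))

factors, defining = "ABCDE", "ABCDE"

# "Odd" half-replicate: conditions sharing an odd number of letters with ABCDE
half = ["".join(c) for k in range(1, 6, 2) for c in combinations("abcde", k)]

# Alias pairs: each main effect and two-factor interaction with its partner
pairs = [(e, alias(e, defining))
         for k in (1, 2)
         for e in ("".join(c) for c in combinations(factors, k))]

print(half)
for low, high in pairs:
    print(f"{{{low}, {high}}}")
```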

Note in particular that every main effect is aliased with a four-factor interaction. 
Assuming these interactions to be negligible allows us to test for the presence of 
main effects. 

To specify a quarter-replicate of a 2^p factorial experiment (2^(p−2) of the 2^p 
possible treatment conditions), two defining effects must be selected. These two 
and their generalized interaction become the nonestimable effects. Instead of 
alias pairs as in the half-replicate, each remaining effect is now confounded with 
three other effects, each being its generalized interaction with one of the three 
nonestimable effects. 


EXAMPLE 11.16 The article “More on Planning Experiments to Increase Research Efficiency” 
(Industrial and Eng. Chemistry, 1970: 60-65) reports on the results of a quarter-replicate 
of a 2^5 experiment in which the five factors were A = condensation temperature, 
B = amount of material B, C = solvent volume, D = condensation time, 
and E = amount of material E. The response variable was the yield of the chemical 
process. The chosen defining contrasts were ACE and BDE, with generalized interaction 
(ACE)(BDE) = ABCD. The remaining 28 main effects and interactions can 
now be partitioned into seven groups of four effects each, such that the effects within 
a group cannot be assessed separately. For example, the generalized interactions 
of A with the nonestimable effects are (A)(ACE) = CE, (A)(BDE) = ABDE, and 
(A)(ABCD) = BCD, so one alias group is {A, CE, ABDE, BCD}. The complete set of 
alias groups is 


{A, CE, ABDE, BCD}     {B, ABCE, DE, ACD}     {C, AE, BCDE, ABD} 
{D, ACDE, BE, ABC}     {E, AC, BD, ABCDE}     {AB, BCE, ADE, CD} 
{AD, CDE, ABE, BC} 
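
The seven alias groups can be reproduced by multiplying each effect by the three nonestimable effects. The sketch below is an illustration only; the helper name `gi` and the order in which groups are discovered are our own choices.

```python
from itertools import combinations

def gi(*effects):
    """Generalized interaction: letters appearing an odd number of times."""
    letters = set()
    for e in effects:
        letters ^= set(e)
    return "".join(sorted(letters))

factors = "ABCDE"
defining = ["ACE", "BDE"]
nonestimable = defining + [gi(*defining)]          # ACE, BDE, ABCD

groups, seen = [], set(nonestimable)
for k in range(1, len(factors) + 1):
    for combo in combinations(factors, k):
        effect = "".join(combo)
        if effect in seen:
            continue                               # already in an earlier alias group
        group = [effect] + [gi(effect, w) for w in nonestimable]
        seen.update(group)
        groups.append(group)

for g in groups:
    print(g)
```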


Once the defining contrasts have been chosen for a quarter-replicate, they are used 
as in the discussion of confounding to divide the 2^p treatment conditions into four 
groups of 2^(p−2) conditions each. Then any one of the four groups is selected as the 
set of conditions for which data will be collected. Similar comments apply to a 1/2^r 
replicate of a 2^p factorial experiment. 

Having made observations for the selected treatment combinations, a table of 
signs similar to Table 11.11 is constructed. The table contains a row only for each of 
the treatment combinations actually observed rather than the full 2^p rows, and there 
is a single column for each alias group (since each effect in the group would have the 
same set of signs for the treatment conditions selected for observation). The signs in 
each column indicate as usual how contrasts for the various sums of squares are com- 
puted. Yates’s method can also be used, but the rule for arranging observed conditions 
in standard order must be modified. 

The difficult part of a fractional replication analysis typically involves 
deciding what to use for error sum of squares. Since there will usually be no 
replication (though one could observe, e.g., two replicates of a quarter-replicate), 
some effect sums of squares must be pooled to obtain an error sum of squares. In 
a quarter-replicate of a 2^8 experiment, for example, an alias structure can be chosen 
so that the eight main effects and 28 two-factor interactions are each confounded 
only with higher-order interactions and that there are an additional 27 alias 
groups involving only higher-order interactions. Assuming the absence of higher- 
order interaction effects, the resulting 27 sums of squares can then be added to 
yield an error sum of squares, allowing 1 df tests for all main effects and two- 
factor interactions. However, in many cases tests for main effects can be obtained 
only by pooling some or all of the sums of squares associated with alias groups 
involving two-factor interactions, and the corresponding two-factor interactions 
cannot be investigated. 

EXAMPLE 11.17 (Example 11.16 continued) The set of treatment conditions chosen and 
resulting yields for the quarter-replicate of the 2^5 experiment were 

e       ab      ad      bc      cd      ace     bde     abcde 
23.2    15.5    16.9    16.2    23.8    23.4    16.8    18.1 


The abbreviated table of signs is displayed in Table 11.16. 
With SSA denoting the sum of squares for effects in the alias group {A, CE, 
ABDE, BCD}, 

SSA = (−23.2 + 15.5 + 16.9 − 16.2 − 23.8 + 23.4 − 16.8 + 18.1)^2 / 8 = 4.65 


Table 11.16 Table of Signs for Example 11.17 


Similarly, SSB = 53.56, SSC = 10.35, SSD = .91, SSE′ = 10.35 (the ′ differentiates 
this quantity from error sum of squares SSE), SSAB = 6.66, and SSAD = 3.25, 
giving SST = 4.65 + 53.56 + ... + 3.25 = 89.73. To test for main effects, we 
use SSE = SSAB + SSAD = 9.91 with 2 df. The ANOVA table is in Table 11.17. 


Table 11.17 ANOVA Table for Example 11.17 


Source    df    Sum of Squares    Mean Square    f 
A          1         4.65             4.65         .94 
B          1        53.56            53.56       10.80 
C          1        10.35            10.35        2.09 
D          1          .91              .91         .18 
E          1        10.35            10.35        2.09 
Error      2         9.91             4.96 

Total      7        89.73 


Since F_{.05,1,2} = 18.51, none of the five main effects can be judged significant. 
Of course, with only 2 df for error, the test is not very powerful (i.e., it is quite likely 
to fail to detect the presence of effects). The article in Industrial and Engineering 
Chemistry from which the data came actually has an independent estimate of the 
standard error of the treatment effects based on prior experience, so it used a somewhat 
different analysis. Our analysis was done here only for illustrative purposes, 
since one would ordinarily want many more than 2 df for error. 
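
The sums of squares in this example can be reproduced directly from the eight yields. In the sketch below, the sign attached to a condition for a given effect is the product of +1 (letter present) or −1 (letter absent) over the letters of the effect, which is how a table of signs is built; the helper names are hypothetical.

```python
yields = {"e": 23.2, "ab": 15.5, "ad": 16.9, "bc": 16.2,
          "cd": 23.8, "ace": 23.4, "bde": 16.8, "abcde": 18.1}

def sign(condition, effect):
    """+1 or -1: product over the effect's letters of +1 (present) or -1 (absent)."""
    s = 1
    for letter in effect.lower():
        s *= 1 if letter in condition else -1
    return s

def sum_of_squares(effect):
    """Contrast squared over the number of observations (one per condition)."""
    contrast = sum(sign(c, effect) * y for c, y in yields.items())
    return contrast ** 2 / len(yields)

# One representative effect from each of the seven alias groups
ss = {e: sum_of_squares(e) for e in ["A", "B", "C", "D", "E", "AB", "AD"]}
sse = ss["AB"] + ss["AD"]              # pooled error sum of squares, 2 df
mse = sse / 2
for e in "ABCDE":
    print(e, round(ss[e], 2), round(ss[e] / mse, 2))
```

Because the script carries unrounded intermediate values, an F ratio may differ from the tabled one in the last digit.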


As an alternative to F tests based on pooling sums of squares to obtain SSE, 
a normal probability plot of effect contrasts can be examined. 

EXAMPLE 11.18 An experiment was carried out to investigate shrinkage in the plastic casing material used 
for speedometer cables (“An Explanation and Critique of Taguchi’s Contribution to 
Quality Engineering,” Quality and Reliability Engr. Intl., 1988: 123-131). The engineers 
started with 15 factors: liner outside diameter, liner die, liner material, liner line 
speed, wire braid type, braiding tension, wire diameter, liner tension, liner temperature, 
coating material, coating die type, melt temperature, screen pack, cooling method, and 
line speed. It was suspected that only a few of these factors were important, so a screening 
experiment in the form of a 2^(15−11) factorial (a 1/2^11 fraction of a 2^15 factorial experiment) 
was carried out. The resulting alias structure is quite complicated; in particular, every main 
effect is confounded with two-factor interactions. The response variable was the percentage 
of shrinkage for a cable specimen produced at designated levels of the factors. 

Figure 11.15 displays a normal probability plot of the effect contrasts. All but 
two of the points fall quite close to a straight line. The discrepant points correspond 
to effects E = wire braid type and G = wire diameter, suggesting that these two fac- 
tors are the only ones that affect the amount of shrinkage. 


[Plot: contrast (vertical axis) against z percentile (horizontal axis); the two labeled points are G = Wire diameter and E = Wire-braid type] 

Figure 11.15 Normal probability plot of contrasts from Example 11.18 


The subjects of factorial experimentation, confounding, and fractional replica- 
tion encompass many models and techniques we have not discussed. Please consult 
the chapter references for more information. 


EXERCISES Section 11.4 (38-49) 


38. The accompanying data resulted from an experiment 
to study the nature of dependence of welding current 
on three factors: welding voltage, wire feed speed, 
and tip-to-workpiece distance. There were two levels 
of each factor (a 2^3 experiment) with two replications 
per combination of levels (the averages across replications 
agree with values given in the article “A 
Study on Prediction of Welding Current in Gas 
Metal Arc Welding,” J. Engr. Manuf., 1991: 64-69). 
The first two given numbers are for the treatment (1), 
the next two for a, and so on in standard order: 200.0, 
204.2, 215.5, 219.5, 272.7, 276.9, 299.5, 302.7, 
166.6, 172.6, 186.4, 192.0, 232.6, 240.8, 253.4, 
261.6. 

a. Verify that the sums of squares are as given in the 
accompanying ANOVA table from Minitab. 
b. Which effects appear to be important, and why? 

Analysis of Variance for current 
Source             DF         SS         MS          F        P 
Volt                1     1685.1     1685.1     102.38    0.000 
Speed               1    21272.2    21272.2    1292.36    0.000 
Dist                1     5076.6     5076.6     308.42    0.000 
Volt*speed          1       36.6       36.6       2.22    0.174 
Volt*dist           1        0.4        0.4       0.03    0.877 
Speed*dist          1      109.2      109.2       6.63    0.033 
Volt*speed*dist     1       23.5       23.5       1.43    0.266 
Error               8      131.7       16.5 
Total              15    28335.3 


39. The accompanying data resulted from a 2^3 experiment 
with three replications per combination of treatments 
designed to study the effects of concentration of detergent 
(A), concentration of sodium carbonate (B), and concentration 
of sodium carboxymethyl cellulose (C) on the 
cleaning ability of a solution in washing tests (a larger 
number indicates better cleaning ability than a smaller 
number). 


Factor Levels 


A B C Condition Observations 
1 1 1 (1) 106, 93, 116 
2 1 1 a 198, 200, 214 
1 2 1 b 197, 202, 185 
2 2 1 ab 329, 331, 307 
1 1 2 c 149, 169, 135 
2 1 2 ac 243, 247, 220 
1 2 2 bc 255, 230, 252 
2 2 2 abc 383, 360, 364 


a. After obtaining cell totals x_ijk., compute estimates of 
β_1, γ_11^AC, and γ_21^AC. 

b. Use the cell totals along with Yates’s method to com- 
pute the effect contrasts and sums of squares. Then 
construct an ANOVA table and test all appropriate 
hypotheses using α = .05. 

c. Suppose a low water temperature has been used to 
obtain the data. The entire experiment is then 
repeated with a higher water temperature to obtain 
the following data. Use Yates’s algorithm on the 
entire set of 48 observations to obtain the sums of 
squares and ANOVA table, and then test appropriate 
hypotheses at level .05. 


Condition Observations 
d 144, 154, 158 
ad 239, 227, 244 
bd 232, 242, 246 
abd 364, 362, 346 
cd 194, 162, 203 
acd 284, 295, 291 
bcd 291, 287, 297 
abcd 411, 406, 395 


40. In a study of processes used to remove impurities from 
cellulose goods (“Optimization of Rope-Range 
Bleaching of Cellulosic Fabrics,” Textile Research J., 
1976: 493-496), the following data resulted from a 2^4 
experiment involving the desizing process. The four factors 
were enzyme concentration (A), pH (B), temperature (C), 
and time (D). 

                                                      Starch % by Weight 
             Enzyme            Temp.     Time       1st        2nd 
Treatment    (g/L)     pH      (°C)      (hr)       Repl.      Repl. 
(1)           50       6.0     60.0       6          9.72      13.50 
a             75       6.0     60.0       6          9.80      14.04 
b             50       7.0     60.0       6         10.13      11.27 
ab            75       7.0     60.0       6         11.80      11.30 
c             50       6.0     70.0       6         12.70      11.37 
ac            75       6.0     70.0       6         11.96      12.05 
bc            50       7.0     70.0       6         11.38       9.92 
abc           75       7.0     70.0       6         11.80      11.10 
d             50       6.0     60.0       8         13.15      13.00 
ad            75       6.0     60.0       8         10.60      12.37 
bd            50       7.0     60.0       8         10.37      12.00 
abd           75       7.0     60.0       8         11.30      11.64 
cd            50       6.0     70.0       8         13.05      14.55 
acd           75       6.0     70.0       8         11.15      15.00 
bcd           50       7.0     70.0       8         12.70      14.10 
abcd          75       7.0     70.0       8         13.20      16.12 


a. Use Yates’s algorithm to obtain sums of squares and 
the ANOVA table. 

b. Do there appear to be any second-, third-, or fourth-order 
interaction effects present? Explain your reasoning. 
Which main effects appear to be significant? 


41. As with many dried products, sun-dried tomatoes can 
exhibit an undesirable discoloration during the drying 
and storage process. A replicated 2^3 experiment was 
conducted in an effort to relate color quality to the factors 
storage time, temperature, and packaging type (“Use 
of Factorial Experimental Design for Analyzing the 
Effect of Storage Conditions on Color Quality of Sun-Dried 
Tomatoes,” Sci. Res. and Essays, 2012: 477-489). 
In the following table, higher values of the 
response variable (based on chromaticity measurements) 
are associated with higher color quality: 


                                       Color Quality 
Storage    Storage                 Replication    Replication 
time       temp       Packaging        1              2 
−          −          −              2.38           2.40 
+          −          −              2.38           2.40 
−          +          −              2.42           2.40 
+          +          −              2.31           2.29 
−          −          +              2.38           2.40 
+          −          +              2.38           2.40 
−          +          +              1.94           1.94 
+          +          +              1.93           1.92 

Construct an ANOVA table and use it as a basis for 
deciding which effects appear to be significant. 


42. The following data on power consumption in electric-furnace 
heats (kW consumed per ton of melted 
product) resulted from a 2^4 factorial experiment with 
three replicates (“Studies on a 10-cwt Arc Furnace,” 
J. of the Iron and Steel Institute, 1956: 22). The factors 
were nature of roof (A, low/high), power setting 
(B, low/high), scrap used (C, tube/plate), and charge 
(D, 700 lb/1000 lb). 


Treatment    x_ijklm              Treatment    x_ijklm 

(1)           866, 862, 800       d             988, 808, 650 
a             946, 800, 840       ad            966, 976, 876 
b             774, 834, 746       bd            702, 658, 650 
ab            709, 789, 646       abd           784, 700, 596 
c            1017, 990, 954       cd            922, 808, 868 
ac           1028, 906, 977       acd          1056, 870, 908 
bc            817, 783, 771       bcd           798, 726, 700 
abc           829, 806, 691       abcd          752, 714, 714 


Construct the ANOVA table, and test all hypotheses of 
interest using α = .01. 


43. The article “Statistical Design and Analysis of 
Qualification Test Program for a Small Rocket 
Engine” (Industrial Quality Control, 1964: 14-18) 
presents data from an experiment to assess the effects 
of vibration (A), temperature cycling (B), altitude 
cycling (C), and temperature for altitude cycling and 
firing (D) on thrust duration. A subset of the data is 
given here. (In the article, there were four levels of D 
rather than just two.) Use the Yates method to obtain 
sums of squares and the ANOVA table. Then assume 
that three- and four-factor interactions are absent, 
pool the corresponding sums of squares to obtain an 
estimate of σ^2, and test all appropriate hypotheses at 
level .05. 


                  D1                  D2 
              C1       C2         C1       C2 
A1    B1    21.60    21.60      11.54    11.50 
      B2    21.09    22.17      11.14    11.32 
A2    B1    21.60    21.86      11.75     9.82 
      B2    19.57    21.85      11.69    11.18 


44. a. In a 2^4 experiment, suppose two blocks are to be 
used, and it is decided to confound the ABCD interaction 
with the block effect. Which treatments 
should be carried out in the first block [the one containing 
the treatment (1)], and which treatments are 
allocated to the second block? 

b. In an experiment to investigate niacin retention in 
vegetables as a function of cooking temperature 
(A), sieve size (B), type of processing (C), and 
cooking time (D), each factor was held at two levels. 
Two blocks were used, with the allocation of 
blocks as given in part (a) to confound only the 
ABCD interaction with blocks. Use Yates’s procedure 
to obtain the ANOVA table for the accompanying 
data. 


Treatment    x_ijkl    Treatment    x_ijkl 
(1)           91       d             72 
a             85       ad            78 
b             92       bd            68 
ab            94       abd           79 
c             86       cd            69 
ac            83       acd           75 
bc            85       bcd           72 
abc           90       abcd          71 


c. Assume that all three-way interaction effects are 
absent, so that the associated sums of squares can be 
combined to yield an estimate of σ^2, and carry out all 
appropriate tests at level .05. 


45. a. An experiment was carried out to investigate the 
effects on audio sensitivity of varying resistance (A), 
two capacitances (B, C), and inductance of a coil (D) 
in part of a television circuit. If four blocks were used 
with four treatments per block and the defining effects 
for confounding were AB and CD, which treatments 
appeared in each block? 

b. Suppose two replications of the experiment described 
in part (a) were performed, resulting in the accompa- 
nying data. Obtain the ANOVA table, and test all 
relevant hypotheses at level .01. 


Treatment    x_ijkl1    x_ijkl2    Treatment    x_ijkl1    x_ijkl2 
(1)           618        598       d             598        585 
a             583        560       ad            587        541 
b             477        525       bd            480        508 
ab            421        462       abd           462        449 
c             601        595       cd            603        577 
ac            550        589       acd           571        552 
bc            505        484       bcd           502        508 
abc           452        451       abcd          449        455 


46. In an experiment involving four factors (A, B, C, and D) 
and four blocks, show that at least one main effect or 
two-factor interaction effect must be confounded with 
the block effect. 


47. a. In a seven-factor experiment (A,...,G), suppose a 
quarter-replicate is actually carried out. If the defin- 
ing effects are ABCDE and CDEFG, what is the 
third nonestimable effect, and what treatments are in 
the group containing (1)? What are the alias groups 
of the seven main effects? 

b. If the quarter-replicate is to be carried out using four 
blocks (with eight treatments per block), what are the 
blocks if the chosen confounding effects are ACF 
and BDG? 


48. The article “Applying Design of Experiments to 
Improve a Laser Welding Process” (J. of Engr. 
Manufacture, 2008: 1035-1042) included the results of 
a half replicate of a 2^4 experiment. The four factors were: 
A. Power (2900 W, 3300 W), B. Current (2400 mV, 
3600 mV), C. Laterals cleaning (No, Yes), and D. Roof 
cleaning (No, Yes). 

a. If the effect ABCD is chosen as the defining effect 
for the replicate and the group of eight treatments for 
which data is obtained includes treatment (1), what 
other treatments are in the observed group, and what 
are the alias pairs? 

b. The cited article presented data on two different 
response variables, the percentage of defective joints 
for both the right laser welding cord and the left 
welding cord. Here we consider just the latter 
response. Observations are listed here in standard 
order after deleting the half not observed. Assuming 
that two- and three-factor interactions are negligible, 
test at level .05 for the presence of main effects. Also 
construct a normal probability plot. 


8.936 9.130 4.314 7.692 
AIS 6.061 1.984 3.830 


49. A half-replicate of a 2^5 experiment to investigate the 
effects of heating time (A), quenching time (B), drawing 
time (C), position of heating coils (D), and measurement 
position (E) on the hardness of steel castings resulted in 
the accompanying data. Construct the ANOVA table, and 
(assuming second and higher-order interactions to be 
negligible) test at level .01 for the presence of main 
effects. Also construct a normal probability plot. 


Treatment    Observation    Treatment    Observation 
a             70.4          acd           66.6 
b             72.1          ace           67.5 
c             70.4          ade           64.0 
d             67.4          bcd           66.8 
e             68.0          bce           70.3 
abc           73.8          bde           67.9 
abd           67.0          cde           65.9 
abe           67.8          abcde         68.0 

SUPPLEMENTARY EXERCISES (50-61) 


50. The results of a study on the effectiveness of line drying on 
the smoothness of fabric were summarized in the article 
“Line-Dried vs. Machine-Dried Fabrics: Comparison 
of Appearance, Hand, and Consumer Acceptance” 
(Home Econ. Research J., 1984: 27-35). Smoothness 
scores were given for nine different types of fabric and five 
different drying methods: (1) machine dry, (2) line dry, 
(3) line dry followed by 15-min tumble, (4) line dry with 
softener, and (5) line dry with air movement. Regarding 
the different types of fabric as blocks, construct an 
ANOVA table. Using a .05 significance level, test to see 
whether there is a difference in the true mean smoothness 
score for the drying methods. 


                              Drying Method 

                        1      2      3      4      5 

          Crepe        3.3    2.5    2.8    2.5    1.9 
          Double knit  3.6    2.0    3.6    2.4    2.3 
          Twill        4.2    3.4    3.8    3.1    3.1 
          Twill mix    3.4    2.4    2.9    1.6    1.7 
Fabric    Terry        3.8    1.3    2.8    2.0    1.6 
          Broadcloth   2.2    1.5    2.7    1.5    1.9 
          Sheeting     3.5    2.1    2.8    2.1    2.2 
          Corduroy     3.6    1.3    2.8    1.7    1.8 
          Denim        2.6    1.4    2.4    1.3    1.6 


51. The water absorption of two types of mortar used to 
repair damaged cement was discussed in the article 
“Polymer Mortar Composite Matrices for 
Maintenance-Free, Highly Durable Ferrocement” (J. 
of Ferrocement, 1984: 337-345). Specimens of ordinary 
cement mortar (OCM) and polymer cement mortar 
(PCM) were submerged for varying lengths of time (5, 9, 
24, or 48 hours) and water absorption (% by weight) was 
recorded. With mortar type as factor A (with two levels) 
and submersion period as factor B (with four levels), 
three observations were made for each factor level combination. 
Data included in the article was used to compute 
the sums of squares, which were SSA = 322.667, 
SSB = 35.623, SSAB = 8.557, and SST = 372.113. Use 
this information to construct an ANOVA table. Test the 
appropriate hypotheses at a .05 significance level. 


Four plots were available for an experiment to compare 
clover accumulation for four different sowing rates 
(“Performance of Overdrilled Red Clover with 
Different Sowing Rates and Initial Grazing 
Managements,” N. Zeal. J. of Exp. Ag., 1984: 71-81). 
Since the four plots had been grazed differently prior to 
the experiment and it was thought that this might affect 
clover accumulation, a randomized block experiment 
was used with all four sowing rates tried on a section of 
each plot. Use the given data to test the null hypothesis 
of no difference in true mean clover accumulation (kg 
DM+/ha) for the different sowing rates. 


            Sowing Rate (kg/ha)
Plot     3.6     6.6    10.2    13.5
1       1155    2255    3505    4632
2        123     406     564     416
3         68     416     662     379
4         62      75     362     564
53. In an automated chemical coating process, the speed
with which objects on a conveyor belt are passed through
a chemical spray (belt speed), the amount of chemical
sprayed (spray volume), and the brand of chemical used
(brand) are factors that may affect the uniformity of the
coating applied. A replicated $2^3$ experiment was
conducted in an effort to increase the coating uniformity. In
the following table, higher values of the response
variable are associated with higher surface uniformity:

                                          Surface Uniformity
Run   Spray Volume   Belt Speed   Brand   Replication 1   Replication 2
1          -             -          -          40              36
2          +             -          -          25              28
3          -             +          -          30              32
4          +             +          -          50              48
5          -             -          +          45              43
6          +             -          +          25              30
7          -             +          +          30              29
8          +             +          +          52              49

Analyze this data and state your conclusions.

54. Coal-fired power plants used in the electrical industry
have gained increased public attention because of the
environmental problems associated with solid wastes
generated by large-scale combustion (“Fly Ash Binders
in Stabilization of FGD Wastes,” J. of Environmental
Engineering, 1998: 43-49). A study was conducted to
analyze the influence of three factors that affect certain
leaching characteristics of solid wastes from combustion:
binder type (A), amount of water (B), and land disposal
scenario (C). Each factor was studied at two levels.
An unreplicated $2^3$ experiment was run, and a response
value EC50 (the effective concentration, in mg/L, that
decreases 50% of the light in a luminescence bioassay)
was measured for each combination of factor levels. The
experimental data is given in the following table:

          Factor          Response
Run     A     B     C       EC50
1      -1    -1    -1      23,100
2       1    -1    -1      43,000
3      -1     1    -1      71,400
4       1     1    -1      76,000
5      -1    -1     1      37,000
6       1    -1     1      33,200
7      -1     1     1      17,000
8       1     1     1      16,500

Carry out an appropriate ANOVA, and state your
conclusions.

55. Impurities in the form of iron oxides lower the economic
value and usefulness of industrial minerals, such as kaolins,
to ceramic and paper-processing industries. A $2^4$ experiment
was conducted to assess the effects of four factors on
the percentage of iron removed from kaolin samples
(“Factorial Experiments in the Development of a Kaolin
Bleaching Process Using Thiourea in Sulphuric Acid
Solutions,” Hydrometallurgy, 1997: 181-197). The factors
and their levels are listed in the following table:

Factor   Description    Units   Low Level   High Level
A        H2SO4          M         .10         .25
B        Thiourea       g/L       0.0         5.0
C        Temperature    °C        70          90
D        Time           min       30          150

The data from an unreplicated $2^4$ experiment is listed in
the next table.

Test Run   Iron Extraction (%)   Test Run   Iron Extraction (%)
(1)               7              d                 28
a                11              ad                51
b                 7              bd                33
ab               12              abd               57
c                21              cd                70
ac               41              acd               95
bc               27              bcd               77
abc              48              abcd              99


a. Calculate estimates of all main effects and two-factor 
interaction effects for this experiment. 

b. Create a probability plot of the effects. Which effects 
appear to be important? 


56. Factorial designs have been used in forestry to assess the
effects of various factors on the growth behavior of trees.
In one such experiment, researchers thought that healthy
spruce seedlings should bud sooner than diseased spruce
seedlings (“Practical Analysis of Factorial Experiments
in Forestry,” Canadian J. of Forestry, 1995: 446-461).
In addition, before planting, seedlings were also exposed
to three levels of pH to see whether this factor has an effect
on virus uptake into the root system. The following table
shows data from a 2 × 3 experiment to study both factors:

                               pH
Health Status    3              5.5            7
Diseased         1.2, 1.4,      .8, .6,        1.0, 1.0,
                 1.0, 1.2,      .8, 1.0,       1.2, 1.4,
                 1.4            .8             1.2
Healthy          1.4, 1.6,      1.0, 1.2,      1.2, 1.4,
                 1.6, 1.6,      1.2, 1.4,      1.2, 1.2,
                 1.4            1.4            1.4

The response variable is an average rating of five buds
from a seedling. The ratings are 0 (bud not broken),
1 (bud partially expanded), and 2 (bud fully expanded).
Analyze this data.

57. One property of automobile air bags that contributes to
their ability to absorb energy is the permeability (ft³/ft²/
min) of the woven material used to construct the air bags.
Understanding how permeability is influenced by vari- 
ous factors is important for increasing the effectiveness 
of air bags. In one study, the effects of three factors, each 
at three levels, were studied (“Analysis of Fabrics Used
in Passive Restraint Systems—Airbags,” J. of the 
Textile Institute, 1996: 554-571): 


A (Temperature): 8°C, 50°C, 75°C
B (Fabric denier): 420-D, 630-D, 840-D
C (Air pressure): 17.2 kPa, 34.4 kPa, 103.4 kPa

Temperature 8°
                 Pressure
Denier      17.2    34.4    103.4
420-D       2B      157     332
            80      155     309
630-D       35      91      288
            43      98      271
840-D       125     234     477
            1 33 464

Temperature 50°
                 Pressure
Denier      17.2    34.4    103.4
420-D       52      125     281
            51      118     264
630-D       16      72      169
            12      78      173
840-D       90      149     338
            100     155     350

Temperature 75°
                 Pressure
Denier      17.2    34.4    103.4
420-D       37      95      276
            31      106     281
630-D       30      91      213
            41      100     211
840-D       102     170     307
            98      160     311

Analyze this data and state your conclusions (assume
that all factors are fixed).

58. A chemical engineer has carried out an experiment to study
the effects of the fixed factors of vat pressure (A), cooking
time of pulp (B), and hardwood concentration (C) on the
strength of paper. The experiment involved two pressures,
four cooking times, three concentrations, and two observations
at each combination of these levels. Calculated sums
of squares are SSA = 6.94, SSB = 5.61, SSC = 12.33,
SSAB = 4.05, SSAC = 7.32, SSBC = 15.80, SSE =
14.40, and SST = 70.82. Construct the ANOVA table, and
carry out appropriate tests at significance level .05.

59. The bond strength when mounting an integrated circuit
on a metalized glass substrate was studied as a function
of factor A = adhesive type, factor B = cure time, and
factor C = conductor material (copper and nickel). The
data follows, along with an ANOVA table from Minitab.
What conclusions can you draw from the data?

                        Cure Time
Copper            1       2       3
Adhesive   1     72.7    74.6    80.0
                 80.0    77.5    82.7
           2     77.8    78.5    84.6
                 75.3    81.1    78.3
           3     ee sie ig
                 76.5    82.6    85.0

                        Cure Time
Nickel            1       2       3
Adhesive   1     og on =
                 77.4    78.2    74.6
           2     79.3    78.8    83.0
                 77.8    75.4    83.9
           3     77.2    84.5    89.4
                 78.4    77.5    81.2

Analysis of Variance for strength
Source             DF        SS        MS       F       P
Adhesive            2   101.317    50.659    6.54   0.007
Curetime            2   151.317    75.659    9.76   0.001
Conmater            1     0.722     0.722    0.09   0.764
Adhes*curet         4    30.526     7.632    0.98   0.441
Adhes*conm          2     8.015     4.008    0.52   0.605
Curet*conm          2     5.952     2.976    0.38   0.687
Adh*curet*conm      4    33.298     8.325    1.07   0.398
Error              18   139.515     7.751
Total              35   470.663

60. The article “Effect of Cutting Conditions on Tool
Performance in CBN Hard Turning” (J. of Manuf.
Processes, 2005: 10-17) reported the accompanying
data on cutting speed (m/s), feed (mm/rev), depth of cut
(mm), and tool life (min). Carry out a three-factor
ANOVA on tool life, assuming the absence of any factor
interactions (as did the authors of the article).

Obs   Cut spd   Feed    Cut dpth   Life
1     1.21      0.061   0.102      27.5
2     1.21      0.168   0.102      26.5
3     1.21      0.061   0.203      27.0
4     1.21      0.168   0.203      25.0
5     3.05      0.061   0.102       8.0
6     3.05      0.168   0.102       5.0
7     3.05      0.061   0.203      71.0
8     3.05      0.168   0.203       3.5
61. Analogous to a Latin square, a Greco-Latin square design
can be used when it is suspected that three extraneous
factors may affect the response variable and all four factors
(the three extraneous ones and the one of interest)
have the same number of levels. In a Latin square, each
level of the factor of interest (C) appears once in each row
(with each level of A) and once in each column (with
each level of B). In a Greco-Latin square, each level of
factor D appears once in each row, in each column, and
also with each level of the third extraneous factor C.
Alternatively, the design can be used when the four factors
are all of equal interest, the number of levels of each
is N, and resources are available only for $N^2$ observations.
A 5 × 5 square is pictured in (a), with (k, l) in each cell
denoting the kth level of C and lth level of D. In (b) we
present data on weight loss in silicon bars used for
semiconductor material as a function of volume of etch (A),
color of nitric acid in the etch solution (B), size of bars
(C), and time in the etch solution (D) (from “Applications
of Analytic Techniques to the Semiconductor
Industry,” Fourteenth Midwest Quality Control
Conference, 1959).

Let $x_{ij(kl)}$ denote the observed weight loss when factor
A is at level i, B is at level j, C is at level k, and D is at
level l. Assuming no interaction between factors, the total
sum of squares SST (with $N^2 - 1$ df) can be partitioned
into SSA, SSB, SSC, SSD, and SSE. Give expressions
for these sums of squares, including computing formulas,
obtain the ANOVA table for the given data, and test each
of the four main effect hypotheses using $\alpha = .05$.


                            B
(C, D)      1        2        3        4        5
     1    (1, 1)   (2, 3)   (3, 5)   (4, 2)   (5, 4)
     2    (2, 2)   (3, 4)   (4, 1)   (5, 3)   (1, 5)
A    3    (3, 3)   (4, 5)   (5, 2)   (1, 4)   (2, 1)
     4    (4, 4)   (5, 1)   (1, 3)   (2, 5)   (3, 2)
     5    (5, 5)   (1, 2)   (2, 4)   (3, 1)   (4, 3)
(a)

            65       82      108      101      126
            84      109       73       97       83
           105      129       89       89       52
           119       72       76      117       84
            97       59       94       78      106
(b)
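
The defining property of the square in (a) can be checked mechanically. The following Python sketch is only an illustration: the names `SQUARE` and `is_graeco_latin` are hypothetical, and the cell entries are the (C, D) pairs as read from square (a). It confirms that each level of C and each level of D appears once in every row and every column, and that every (C, D) combination occurs exactly once.

```python
# Square (a): entry [i][j] is (level of C, level of D) for A = i + 1, B = j + 1.
SQUARE = [
    [(1, 1), (2, 3), (3, 5), (4, 2), (5, 4)],
    [(2, 2), (3, 4), (4, 1), (5, 3), (1, 5)],
    [(3, 3), (4, 5), (5, 2), (1, 4), (2, 1)],
    [(4, 4), (5, 1), (1, 3), (2, 5), (3, 2)],
    [(5, 5), (1, 2), (2, 4), (3, 1), (4, 3)],
]


def is_graeco_latin(square):
    """Return True if the square of (C, D) pairs is a Greco-Latin square."""
    n = len(square)
    levels = set(range(1, n + 1))
    columns = list(zip(*square))
    for line in list(square) + columns:
        # Each level of C and of D must appear exactly once in every row/column.
        if {c for c, _ in line} != levels or {d for _, d in line} != levels:
            return False
    # Every (C, D) combination must occur exactly once in the whole square.
    pairs = {pair for row in square for pair in row}
    return len(pairs) == n * n


print(is_graeco_latin(SQUARE))
```

A check of this kind is useful before any sums of squares are computed, because a single misplaced cell destroys the balance on which the partition of SST relies.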
12 Simple Linear Regression and Correlation


INTRODUCTION 


In the two-sample problems discussed in Chapter 9, we were interested in 
comparing values of parameters for the x distribution and the y distribution. 
Even when observations were paired, we did not try to use information about 
one of the variables in studying the other variable. This is precisely the objec- 
tive of regression analysis: to exploit the relationship between two (or more) 
variables so that we can gain information about one of them through knowing 
values of the other(s). 

Much of mathematics is devoted to studying variables that are determin- 
istically related. Saying that x and y are related in this manner means that once 
we are told the value of x, the value of y is completely specified. For example, 
consider renting a van for a day, and suppose that the rental cost is $25.00 
plus $.30 per mile driven. Letting x = the number of miles driven and y = the 
rental charge, then y = 25 + .3x. If the van is driven 100 miles (x = 100), then 
y = 25 + .3(100) = 55. As another example, if the initial velocity of a particle is $v_0$
and it undergoes constant acceleration a, then distance traveled = $y = v_0 x + \frac{1}{2} a x^2$,
where x = time.

There are many variables x and y that would appear to be related to 
one another, but not in a deterministic fashion. A familiar example is given 
by variables x = high school grade point average (GPA) and y = college 
GPA. The value of y cannot be determined just from knowledge of x, and 
two different individuals could have the same x value but have very different 
y values. Yet there is a tendency for those who have high (low) high school 
GPAs also to have high (low) college GPAs. Knowledge of a student's high 
school GPA should be quite helpful in enabling us to predict how that person
will do in college.

Other examples of variables related in a nondeterministic fashion
include x = age of a child and y = size of that child’s vocabulary, x = size
of an engine (cm³) and y = fuel efficiency for an automobile equipped with
that engine, and x = applied tensile force and y = amount of elongation in
a metal strip.

Regression analysis is the part of statistics that investigates the rela- 
tionship between two or more variables related in a nondeterministic fashion. 
In this chapter, we generalize the deterministic linear relation $y = \beta_0 + \beta_1 x$
to a linear probabilistic relationship, develop procedures for making various 
inferences based on the model, and obtain a quantitative measure (the cor- 
relation coefficient) of the extent to which the two variables are related. In 
Chapter 13, we will consider techniques for validating a particular model and 
investigate nonlinear relationships and relationships involving more than two 
variables. 


12.1 The Simple Linear Regression Model 


The simplest deterministic mathematical relationship between two variables x and y
is a linear relationship $y = \beta_0 + \beta_1 x$. The set of pairs (x, y) for which $y = \beta_0 + \beta_1 x$
determines a straight line with slope $\beta_1$ and y-intercept $\beta_0$.* The objective of this
section is to develop a linear probabilistic model.

If the two variables are not deterministically related, then for a fixed value 
of x, there is uncertainty in the value of the second variable. For example, if we are 
investigating the relationship between age of child and size of vocabulary and decide 
to select a child of age x = 5.0 years, then before the selection is made, vocabulary 
size is a random variable Y. After a particular 5-year-old child has been selected and
tested, a vocabulary of 2000 words may result. We would then say that the observed 
value of Y associated with fixing x = 5.0 was y = 2000. 

More generally, the variable whose value is fixed by the experimenter will 
be denoted by x and will be called the independent, predictor, or explanatory 
variable. For fixed x, the second variable will be random; we denote this random 
variable and its observed value by Y and y, respectively, and refer to it as the 
dependent or response variable. 

Usually observations will be made for a number of settings of the independent
variable. Let $x_1, x_2, \ldots, x_n$ denote values of the independent variable for
which observations are made, and let $Y_i$ and $y_i$, respectively, denote the random
variable and observed value associated with $x_i$. The available bivariate data then
consists of the n pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. A picture of this data called a
scatterplot gives preliminary impressions about the nature of any relationship.
In such a plot, each $(x_i, y_i)$ is represented as a point plotted on a two-dimensional
coordinate system.


* The slope of a line is the change in y for a 1-unit increase in x. For example, if y = -3x + 10, then y
decreases by 3 when x increases by 1, so the slope is -3. The y-intercept is the height at which the line
crosses the vertical axis and is obtained by setting x = 0 in the equation.

EXAMPLE 12.1 Visual and musculoskeletal problems associated with the use of visual display
terminals (VDTs) have become rather common in recent years. Some researchers
have focused on vertical gaze direction as a source of eye strain and irritation.
This direction is known to be closely related to ocular surface area (OSA), so a
method of measuring OSA is needed. The accompanying representative data on
y = OSA (cm²) and x = width of the palpebral fissure (i.e., the horizontal width
of the eye opening, in cm) is from the article “Analysis of Ocular Surface Area
for Comfortable VDT Workstation Layout” (Ergonomics, 1996: 877-884). The
order in which observations were obtained was not given, so for convenience they
are listed in increasing order of x values.


i      1     2     3     4     5     6     7     8     9     10    11    12    13    14    15
x_i   .40   .42   .48   .51   .57   .60   .70   .75   .75   .78   .84   .95   .99   1.03  1.12
y_i   1.02  1.21  .88   .98   1.52  1.83        1.80  1.74  1.63  2.00  2.80  2.48  2.47  3.05


i| 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 
x; | 1.15 1.20 1.25 1.40 1.43 146 149 1.55 1.58 1.60 


1.25 1.28 1.30 


y,|3.18 3.76 3.68 3.82 3.21 4.27 3.75 4.10 4.18 3.77 4.34 4.21 4.92 


Thus $(x_1, y_1) = (.40, 1.02)$, $(x_5, y_5) = (.57, 1.52)$, and so on. A Minitab scatterplot is
shown in Figure 12.1; we used an option that produced a dotplot of both the x values
and y values individually along the right and top margins of the plot, which makes
it easier to visualize the distributions of the individual variables (histograms or
boxplots are alternative options). Here are some things to notice about the data and plot:

• Several observations have identical x values yet different y values (e.g.,
$x_8 = x_9 = .75$, but $y_8 = 1.80$ and $y_9 = 1.74$). Thus the value of y is not
determined solely by x but also by various other factors.

• There is a strong tendency for y to increase as x increases. That is, larger values
of OSA tend to be associated with larger values of fissure width, a positive
relationship between the variables.

• It appears that the value of y could be predicted from x by finding a line that is
reasonably close to the points in the plot (the authors of the cited article superimposed
such a line on their plot). In other words, there is evidence of a substantial (though
not perfect) linear relationship between the two variables.

Figure 12.1 Scatterplot from Minitab for the data from Example 12.1 (OSA versus palwidth),
along with dotplots of x and y values


The horizontal and vertical axes in the scatterplot of Figure 12.1 intersect at 
the point (0, 0). In many data sets, the values of x or y or the values of both variables 
differ considerably from zero relative to the range(s) of the values. For example, a 
study of how air conditioner efficiency is related to maximum daily outdoor tem- 
perature might involve observations for temperatures ranging from 80°F to 100°F. 
When this is the case, a more informative plot would show the appropriately labeled 
axes intersecting at some point other than (0, 0). 


EXAMPLE 12.2 Arsenic is found in many ground waters and some surface waters. Recent health 
effects research has prompted the Environmental Protection Agency to reduce 
allowable arsenic levels in drinking water so that many water systems are no longer 
compliant with standards. This has spurred interest in the development of methods to 
remove arsenic. The accompanying data on x = pH and y = arsenic removed (%) by 
a particular process was read from a scatterplot in the article “Optimizing Arsenic 
Removal During Iron Removal: Theoretical and Practical Considerations” (J. 
of Water Supply Res. and Tech., 2005: 545-560). 


x 7.01 7.11 7.12 7.24 7.94 7.94 8.04 8.05 8.07 


y 60 67 66 52 50 45 52 48 40 


x 8.90 8.94 8.95 8.97 8.98 9.85 9.86 9.86 9.87 
y 23 20 40 31 26 9 22 13 7 


Figure 12.2 shows two Minitab scatterplots of this data. In Figure 12.2(a), the soft- 
ware selected the scale for both axes. We obtained Figure 12.2(b) by specifying scal- 
ing for the axes so that they would intersect at roughly the point (0, 0). The second 
plot is much more crowded than the first one; such crowding can make it difficult 
to ascertain the general nature of any relationship. For example, curvature can be 
overlooked in a crowded plot. 


Figure 12.2 Minitab scatterplots of data in Example 12.2 (% removal versus pH), panels (a) and (b)

Large values of arsenic removal tend to be associated with low pH, a negative or
inverse relationship. Furthermore, the two variables appear to be at least approximately
linearly related, although the points in the plot would spread out somewhat
about any superimposed straight line (such a line appeared in the plot in the cited
article).
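
The crowding described for Figure 12.2(b) can be quantified with a short calculation. The sketch below is illustrative only: the helper `axis_fraction_used` is a hypothetical name, and the axis limits are approximate values read from the tick labels of the two panels. It computes the fraction of the horizontal axis that the pH data actually occupy under each scaling.

```python
# pH values from Example 12.2.
ph = [7.01, 7.11, 7.12, 7.24, 7.94, 7.94, 8.04, 8.05, 8.07,
      8.90, 8.94, 8.95, 8.97, 8.98, 9.85, 9.86, 9.86, 9.87]


def axis_fraction_used(values, axis_min, axis_max):
    """Fraction of the axis length spanned by the data."""
    return (max(values) - min(values)) / (axis_max - axis_min)


# Axis limits are read off the tick labels of Figure 12.2 and are approximate.
frac_a = axis_fraction_used(ph, 7.0, 10.0)   # panel (a): software-chosen scale
frac_b = axis_fraction_used(ph, 0.0, 10.0)   # panel (b): axis starts at 0
print(round(frac_a, 2), round(frac_b, 2))
```

Under these assumed limits the data fill most of the axis in panel (a) but less than a third of it in panel (b), which is why the second plot looks compressed.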


A Linear Probabilistic Model 


For the deterministic model $y = \beta_0 + \beta_1 x$, the actual observed value of y is a linear
function of x. The appropriate generalization of this to a probabilistic model assumes
that the expected value of Y is a linear function of x, but that for fixed x the variable
Y differs from its expected value by a random amount.


DEFINITION The Simple Linear Regression Model

There are parameters $\beta_0$, $\beta_1$, and $\sigma^2$, such that for any fixed value of the
independent variable x, the dependent variable is a random variable related to x
through the model equation

$Y = \beta_0 + \beta_1 x + \epsilon$   (12.1)

The quantity $\epsilon$ in the model equation is a random variable, assumed to be
normally distributed with $E(\epsilon) = 0$ and $V(\epsilon) = \sigma^2$.


The variable $\epsilon$ is usually referred to as the random deviation or random
error term in the model. Without $\epsilon$, any observed pair (x, y) would correspond to
a point falling exactly on the line $y = \beta_0 + \beta_1 x$, called the true (or population)
regression line. The inclusion of the random error term allows (x, y) to fall either
above the true regression line (when $\epsilon > 0$) or below the line (when $\epsilon < 0$). The
points $(x_1, y_1), \ldots, (x_n, y_n)$ resulting from n independent observations will then be
scattered about the true regression line, as illustrated in Figure 12.3. On occasion,
the appropriateness of the simple linear regression model may be suggested by
theoretical considerations (e.g., there is an exact linear relationship between the two
variables, with $\epsilon$ representing measurement error). Much more frequently, though,
the reasonableness of the model is indicated by a scatterplot exhibiting a substantial
linear pattern (as in Figures 12.1 and 12.2).
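
To see how the model equation generates data scattered about the true regression line, the following Python sketch simulates observations from it. Everything specific here is an assumption made for illustration: the parameter values, the x settings, the seed, and the function name `simulate_slr` are hypothetical and do not come from any data set in the text.

```python
import random


def simulate_slr(x_values, beta0, beta1, sigma, rng):
    """Draw one Y per x from Y = beta0 + beta1*x + epsilon, epsilon ~ N(0, sigma^2)."""
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]


# Hypothetical parameter values chosen only for illustration.
BETA0, BETA1, SIGMA = 10.0, 2.0, 1.5
rng = random.Random(12)              # fixed seed so the run is reproducible
xs = [i / 10 for i in range(1, 201)]
ys = simulate_slr(xs, BETA0, BETA1, SIGMA, rng)

# A point lies above the true line exactly when its random deviation is positive.
above = sum(y > BETA0 + BETA1 * x for x, y in zip(xs, ys))
print(f"{above} of {len(xs)} points fall above the true regression line")
```

Because the random deviation is symmetric about 0, roughly half of the simulated points should land above the line and half below it.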


Figure 12.3 Points corresponding to observations from the simple linear regression model

Implications of the model equation (12.1) can best be understood with the aid of
the following notation. Let x* denote a particular value of the independent variable x and

$\mu_{Y \cdot x^*}$ = the expected (or mean) value of Y when x has value x*

$\sigma^2_{Y \cdot x^*}$ = the variance of Y when x has value x*

Alternative notation is E(Y|x*) and V(Y|x*). For example, if x = applied stress
(kg/mm²) and y = time-to-fracture (hr), then $\mu_{Y \cdot 20}$ would denote the expected value
of time-to-fracture when applied stress is 20 kg/mm². If we think of an entire
population of (x, y) pairs, then $\mu_{Y \cdot x^*}$ is the mean of all y values for which x = x*, and
$\sigma^2_{Y \cdot x^*}$ is a measure of how much these values of y spread out about the mean value.
If, for example, x = age of a child and y = vocabulary size, then $\mu_{Y \cdot 5}$ is the average
vocabulary size for all 5-year-old children in the population, and $\sigma^2_{Y \cdot 5}$ describes the
amount of variability in vocabulary size for this part of the population. Once x is
fixed, the only randomness on the right-hand side of the model equation (12.1) is
in the random error $\epsilon$, and its mean value and variance are 0 and $\sigma^2$, respectively,
whatever the value of x. This implies that

$\mu_{Y \cdot x^*} = E(\beta_0 + \beta_1 x^* + \epsilon) = \beta_0 + \beta_1 x^* + E(\epsilon) = \beta_0 + \beta_1 x^*$

$\sigma^2_{Y \cdot x^*} = V(\beta_0 + \beta_1 x^* + \epsilon) = V(\beta_0 + \beta_1 x^*) + V(\epsilon) = 0 + \sigma^2 = \sigma^2$


Replacing x* in $\mu_{Y \cdot x^*}$ by x gives the relation $\mu_{Y \cdot x} = \beta_0 + \beta_1 x$, which says that
the mean value of Y, rather than Y itself, is a linear function of x. The true regression
line $y = \beta_0 + \beta_1 x$ is thus the line of mean values; its height above any particular x
value is the expected value of Y for that value of x. The slope $\beta_1$ of the true regression
line is interpreted as the expected change in Y associated with a 1-unit increase in the
value of x. The second relation states that the amount of variability in the distribution
of Y values is the same at each different value of x (homogeneity of variance). If the
independent variable is vehicle weight and the dependent variable is fuel efficiency
(mpg), then the model implies that the average fuel efficiency changes linearly with
weight (presumably $\beta_1$ is negative) and that the amount of variability in efficiency
for any particular weight is the same as at any other weight. Finally, for fixed x, Y is
the sum of a constant $\beta_0 + \beta_1 x$ and a normally distributed rv $\epsilon$, so Y itself has a normal
distribution. These properties are illustrated in Figure 12.4. The variance parameter
$\sigma^2$ determines the extent to which each normal curve spreads out about its mean
value; roughly speaking, the value of $\sigma$ is the size of a typical deviation from the
true regression line. An observed point (x, y) will almost always fall quite close to the
true regression line when $\sigma$ is small, whereas observations may deviate considerably
from their expected values (corresponding to points far from the line) when $\sigma$ is large.

Figure 12.4 (a) Distribution of $\epsilon$; (b) distribution of Y for different values of x
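
The two relations for the mean and variance of Y at a fixed x can be checked numerically. The sketch below is a simulation under assumed values: the parameters, the seed, the sample size, and the helper `draws_at` are all hypothetical choices made for illustration. It draws many observations of Y at several fixed x values and compares the sample mean and standard deviation with $\beta_0 + \beta_1 x^*$ and $\sigma$.

```python
import random
import statistics

# Hypothetical parameters for illustration; not taken from any data set in the text.
BETA0, BETA1, SIGMA = 50.0, -0.8, 4.0
rng = random.Random(2024)


def draws_at(x_star, n=20000):
    """Simulate n independent observations of Y at the fixed value x = x_star."""
    return [BETA0 + BETA1 * x_star + rng.gauss(0.0, SIGMA) for _ in range(n)]


summary = {}
for x_star in (10, 20, 30):
    ys = draws_at(x_star)
    # Sample mean should be near beta0 + beta1*x_star; sample sd near sigma.
    summary[x_star] = (statistics.fmean(ys), statistics.stdev(ys))
    print(x_star, BETA0 + BETA1 * x_star, summary[x_star])
```

The sample means should track the line of mean values while the sample standard deviations stay close to the same $\sigma$ at every x, which is the homogeneity of variance property.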


EXAMPLE 12.3 Suppose the relationship between applied stress x and time-to-failure y is
described by the simple linear regression model with true regression line
$y = 65 - 1.2x$ and $\sigma = 8$. Then for any fixed value x* of stress, time-to-failure
has a normal distribution with mean value $65 - 1.2x^*$ and standard deviation 8.
In the population consisting of all (x, y) points, the magnitude of a typical
deviation from the true regression line is about 8. For x = 20, Y has mean value
$\mu_{Y \cdot 20} = 65 - 1.2(20) = 41$, so

$P(Y > 50 \text{ when } x = 20) = P\left(Z > \frac{50 - 41}{8}\right) = 1 - \Phi(1.13) = .1292$

Because $\mu_{Y \cdot 25} = 35$,

$P(Y > 50 \text{ when } x = 25) = P\left(Z > \frac{50 - 35}{8}\right) = 1 - \Phi(1.88) = .0301$

These probabilities are illustrated as the shaded areas in Figure 12.5.
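
The two tail probabilities can be reproduced with Python's standard library. This is a minimal sketch; the function name `prob_exceeds` is hypothetical, while the parameter values are those of Example 12.3. Because the code does not round z to two decimal places as the normal table requires, its results differ slightly in the third decimal from .1292 and .0301.

```python
from statistics import NormalDist


def prob_exceeds(threshold, x, beta0=65.0, beta1=-1.2, sigma=8.0):
    """P(Y > threshold) when Y ~ N(beta0 + beta1*x, sigma^2)."""
    mean = beta0 + beta1 * x
    return 1.0 - NormalDist(mu=mean, sigma=sigma).cdf(threshold)


p20 = prob_exceeds(50, 20)   # mean 41, z = (50 - 41)/8
p25 = prob_exceeds(50, 25)   # mean 35, z = (50 - 35)/8
print(round(p20, 4), round(p25, 4))
```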


Figure 12.5 Probabilities based on the simple linear regression model


Suppose that $Y_1$ denotes an observation on time-to-failure made with x = 25
and $Y_2$ denotes an independent observation made with x = 24. Then $Y_1 - Y_2$ is
normally distributed with mean value $E(Y_1 - Y_2) = \beta_1 = -1.2$, variance
$V(Y_1 - Y_2) = \sigma^2 + \sigma^2 = 128$, and standard deviation $\sqrt{128} = 11.314$.
The probability that $Y_1$ exceeds $Y_2$ is

$P(Y_1 - Y_2 > 0) = P\left(Z > \frac{0 - (-1.2)}{11.314}\right) = P(Z > .11) = .4562$

That is, even though we expected Y to decrease when x increases by 1 unit, it is not
unlikely that the observed Y at x + 1 will be larger than the observed Y at x.
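
The same calculation can be sketched in Python, again as an illustration rather than a prescribed method. The variable names are hypothetical; the parameter values are those of Example 12.3. The unrounded z value is used, so the result agrees with .4562 only to about two decimal places.

```python
from math import sqrt
from statistics import NormalDist

BETA1, SIGMA = -1.2, 8.0          # values from Example 12.3

# Y1 (at x = 25) and Y2 (at x = 24) are independent normal variables, so their
# difference is normal with mean beta1 * (25 - 24) and variance 2 * sigma^2.
diff = NormalDist(mu=BETA1 * (25 - 24), sigma=sqrt(2) * SIGMA)
p_y1_larger = 1.0 - diff.cdf(0.0)
print(round(diff.stdev, 3), round(p_y1_larger, 4))
```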


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 




EXERCISES Section 12.1 (1-11) 


1. The efficiency ratio for a steel specimen immersed in a
phosphating tank is the weight of the phosphate coating
divided by the metal loss (both in mg/ft²). The article
“Statistical Process Control of a Phosphate Coating
Line” (Wire J. Intl., May 1997: 78-81) gave the
accompanying data on tank temperature (x) and
efficiency ratio (y).

Temp. |  170   172   173   174   174   175   176
Ratio |  .84  1.31  1.42  1.03  1.07  1.08  1.04

Temp. |  177   180   180   180   180   180   181
Ratio | 1.80  1.45  1.60  1.61  2.13  2.15   .84

Temp. |  181   182   182   182   182   184   184
Ratio | 1.43   .90  1.81  1.94  2.68  1.49  2.52

Temp. |  185   186   188
Ratio | 3.00  1.87  3.08

a. Construct stem-and-leaf displays of both tempera- 
ture and efficiency ratio, and comment on interesting 
features. 

b. Is the value of efficiency ratio completely and uniquely
determined by tank temperature? Explain your
reasoning.

c. Construct a scatterplot of the data. Does it appear 
that efficiency ratio could be very well predicted by 
the value of temperature? Explain your reasoning. 


2. The article “Exhaust Emissions from Four-Stroke
Lawn Mower Engines” (J. of the Air and Water Mgmnt.
Assoc., 1997: 945-952) reported data from a study in
which both a baseline gasoline mixture and a reformulated
gasoline were used. Consider the following observations
on age (yr) and NOx emissions (g/kWh):

Engine           1      2      3      4      5
Age              0      0      2     11      7
Baseline      1.72   4.38   4.06   1.26   5.31
Reformulated  1.88   5.93   5.54   2.67   6.53

Engine           6      7      8      9     10
Age             16      9      0     12      4
Baseline       .57   3.37   3.44    .74   1.24
Reformulated   .74   4.94   4.89    .69   1.42

Construct scatterplots of NOx emissions versus age. What
appears to be the nature of the relationship between these
two variables? [Note: The authors of the cited article
commented on the relationship.]


3. Bivariate data often arises from the use of two different
techniques to measure the same quantity. As an example,
the accompanying observations on x = hydrogen
concentration (ppm) using a gas chromatography method
and y = concentration using a new sensor method were
read from a graph in the article “A New Method to
Measure the Diffusible Hydrogen Content in Steel
Weldments Using a Polymer Electrolyte-Based
Hydrogen Sensor” (Welding Res., July 1997:
251s-256s).

x |  47   62   65   70   70   78   95  100  114  118
y |  38   62   53   67   84   79   93  106  117  116

x | 124  127  140  140  140  150  152  164  198  221
y | 127  114  134  139  142  170  149  154  200  215

Construct a scatterplot. Does there appear to be a very 
strong relationship between the two types of concentra- 
tion measurements? Do the two methods appear to be 
measuring roughly the same quantity? Explain your 
reasoning. 


4. The accompanying data on y = ammonium concentration
(mg/L) and x = transpiration (ml/h) was read from
a graph in the article “Response of Ammonium
Removal to Growth and Transpiration of Juncus
effusus During the Treatment of Artificial Sewage in
Laboratory-Scale Wetlands” (Water Research, 2013:
4265-4273). The article’s abstract stated “a linear
correlation between the ammonium concentration inside
the rhizosphere and the transpiration of the plant stocks
implies that an influence of plant physiological activity
on the efficiency of N-removal exists.” (The rhizosphere
is the narrow region of soil at the plant root-soil
interface, and transpiration is the process of water
movement through a plant and its evaporation.) The
article reported summary quantities from a simple linear
regression analysis. Based on a scatterplot, how
would you describe the relationship between the variables,
and does simple linear regression appear to be an
appropriate modeling strategy?

x |  5.8   8.8  11.0  13.6  18.5  21.0  23.7
y |  7.8   8.2   6.9   5.3   4.7   4.9   4.3

x | 26.0  28.3  31.9  36.5  38.2  40.4
y |  2.7   2.8   1.8   1.9   1.1    .4


5. The article “Objective Measurement of the Stretchability
of Mozzarella Cheese” (J. of Texture Studies, 1992:
185-194) reported on an experiment to investigate how the
behavior of mozzarella cheese varied with temperature.
Consider the accompanying data on x = temperature and
y = elongation (%) at failure of the cheese. [Note: The
researchers were Italian and used real mozzarella cheese,
not the poor cousin widely available in the United States.]


x | 59 63 68 72 74 78 83
y | 118 182 247 208 197 135 132


a. Construct a scatterplot in which the axes intersect 
at (0, 0). Mark 0, 20, 40, 60, 80, and 100 on the 
horizontal axis and 0, 50, 100, 150, 200, and 250 
on the vertical axis. 

b. Construct a scatterplot in which the axes intersect 
at (55, 100), as was done in the cited article. Does 
this plot seem preferable to the one in part (a)? 
Explain your reasoning. 

c. What do the plots of parts (a) and (b) suggest about 
the nature of the relationship between the two 
variables? 


6. One factor in the development of tennis elbow, a malady
that strikes fear in the hearts of all serious tennis players,
is the impact-induced vibration of the racket-and-arm
system at ball contact. It is well known that the likelihood
of getting tennis elbow depends on various properties
of the racket used. Consider the scatterplot of x =
racket resonance frequency (Hz) and y = sum of peak-to-peak
acceleration (a characteristic of arm vibration,
in m/sec/sec) for n = 23 different rackets (“Transfer of
Tennis Racket Vibrations into the Human Forearm,”
Medicine and Science in Sports and Exercise, 1992:
1134-1140). Discuss interesting features of the data and
scatterplot.


[Scatterplot of y versus x for the 23 rackets; horizontal axis marked from 100 to 190]


7. The article “Some Field Experience in the Use of an
Accelerated Method in Estimating 28-Day Strength 
of Concrete” (J. of Amer. Concrete Institute, 1969: 
895) considered regressing y = 28-day standard-cured 
strength (psi) against x = accelerated strength (psi). 
Suppose the equation of the true regression line is 
y = 1800 + 1.3x. 


a. What is the expected value of 28-day strength when
accelerated strength = 2500?

b. By how much can we expect 28-day strength to 
change when accelerated strength increases by 1 psi? 

c. Answer part (b) for an increase of 100 psi. 

d. Answer part (b) for a decrease of 100 psi. 


8. Referring to Exercise 7, suppose that the standard deviation
of the random deviation $\epsilon$ is 350 psi.

a. What is the probability that the observed value of 
28-day strength will exceed 5000 psi when the value 
of accelerated strength is 2000? 

b. Repeat part (a) with 2500 in place of 2000. 

c. Consider making two independent observations on 
28-day strength, the first for an accelerated strength 
of 2000 and the second for x = 2500. What is the 
probability that the second observation will exceed 
the first by more than 1000 psi? 

d. Let $Y_1$ and $Y_2$ denote observations on 28-day strength
when $x = x_1$ and $x = x_2$, respectively. By how much
would $x_2$ have to exceed $x_1$ in order that $P(Y_2 > Y_1) = .95$?


9. The flow rate y (m³/min) in a device used for air-quality
measurement depends on the pressure drop x (in. of
water) across the device’s filter. Suppose that for x values
between 5 and 20, the two variables are related according
to the simple linear regression model with true regression
line y = -.12 + .095x.

a. What is the expected change in flow rate associated 
with a 1-in. increase in pressure drop? Explain. 

b. What change in flow rate can be expected when pres- 
sure drop decreases by 5 in.? 

c. What is the expected flow rate for a pressure drop of 
10 in.? A drop of 15 in.? 

d. Suppose $\sigma = .025$ and consider a pressure drop of
10 in. What is the probability that the observed value 
of flow rate will exceed .835? That observed flow rate 
will exceed .840? 

e. What is the probability that an observation on flow 
rate when pressure drop is 10 in. will exceed an 
observation on flow rate made when pressure drop is 
11 in.? 


10. Suppose the expected cost of a production run is related to
the size of the run by the equation y = 4000 + 10x. Let Y
denote an observation on the cost of a run. If the variables
size and cost are related according to the simple linear
regression model, could it be the case that P(Y > 5500
when x = 100) = .05 and P(Y > 6500 when x = 200) =
.10? Explain.


11. Suppose that in a certain chemical process the reaction
time y (hr) is related to the temperature (°F) in the
chamber in which the reaction takes place according to
the simple linear regression model with equation y =
5.00 - .01x and $\sigma = .075$.

a. What is the expected change in reaction time for a
1°F increase in temperature? For a 10°F increase in
temperature?

b. What is the expected reaction time when temperature
is 200°F? When temperature is 250°F?

c. Suppose five observations are made independently on
reaction time, each one for a temperature of 250°F.
What is the probability that all five times are between
2.4 and 2.6 hr?

d. What is the probability that two independently observed
reaction times for temperatures 1° apart are such that
the time at the higher temperature exceeds the time at
the lower temperature?


12.2 Estimating Model Parameters 


We will assume in this and the next several sections that the variables x and y are
related according to the simple linear regression model. The values of $\beta_0$, $\beta_1$, and
$\sigma^2$ will almost never be known to an investigator. Instead, sample data consisting
of n observed pairs $(x_1, y_1), \ldots, (x_n, y_n)$ will be available, from which the model
parameters and the true regression line itself can be estimated. These observations
are assumed to have been obtained independently of one another. That is,
$y_i$ is the observed value of $Y_i$, where $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ and the n deviations
$\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ are independent rv’s. Independence of $Y_1, Y_2, \ldots, Y_n$ follows from
independence of the $\epsilon_i$’s.

According to the model, the observed points will be distributed about the true 
regression line in a random manner. Figure 12.6 shows a typical plot of observed 
pairs along with two candidates for the estimated regression line. Intuitively, the line 
$y = a_0 + a_1 x$ is not a reasonable estimate of the true line $y = \beta_0 + \beta_1 x$ because, if
$y = a_0 + a_1 x$ were the true line, the observed points would almost surely have been
closer to this line. The line $y = b_0 + b_1 x$ is a more plausible estimate because the
observed points are scattered rather closely about this line.


Figure 12.6 Two different estimates of the true regression line


Figure 12.6 and the foregoing discussion suggest that our estimate of 
$y = \beta_0 + \beta_1 x$ should be a line that provides in some sense a best fit to the observed
data points. This is what motivates the principle of least squares, which can be traced 
back to the German mathematician Gauss (1777-1855). According to this principle, 
a line provides a good fit to the data if the vertical distances (deviations) from the 
observed points to the line are small (see Figure 12.7). The measure of the goodness 
of fit is the sum of the squares of these deviations. The best-fit line is then the one 
having the smallest possible sum of squared deviations. 
Figure 12.7 Deviations of observed data from line $y = b_0 + b_1 x$ (horizontal axis: applied stress, kg/mm²)


Principle of Least Squares
The vertical deviation of the point $(x_i, y_i)$ from the line $y = b_0 + b_1 x$ is

height of point - height of line = $y_i - (b_0 + b_1 x_i)$

The sum of squared vertical deviations from the points $(x_1, y_1), \ldots, (x_n, y_n)$ to
the line is then

$$f(b_0, b_1) = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i)]^2$$

The point estimates of $\beta_0$ and $\beta_1$, denoted by $\hat{\beta}_0$ and $\hat{\beta}_1$ and called the least
squares estimates, are those values that minimize $f(b_0, b_1)$. That is, $\hat{\beta}_0$
and $\hat{\beta}_1$ are such that $f(\hat{\beta}_0, \hat{\beta}_1) \le f(b_0, b_1)$ for any $b_0$ and $b_1$. The estimated
regression line or least squares line is then the line whose equation is

$$y = \hat{\beta}_0 + \hat{\beta}_1 x$$
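
To make the least squares criterion concrete, the following Python sketch evaluates the sum of squared vertical deviations for two candidate lines. The four data points and the function name are hypothetical, invented only for this illustration; they do not come from the text.

```python
def sum_squared_deviations(b0, b1, xs, ys):
    """Return f(b0, b1): the sum of squared vertical deviations
    of the points (x_i, y_i) from the line y = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for this illustration only
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# A poorly fitting candidate line, y = x
f_poor = sum_squared_deviations(0.0, 1.0, xs, ys)

# The least squares line for these four points, y = 0.15 + 1.94x
f_best = sum_squared_deviations(0.15, 1.94, xs, ys)

print(f"f(0, 1)       = {f_poor:.3f}")
print(f"f(0.15, 1.94) = {f_best:.3f}")
```

For these points the second line gives a far smaller value of the criterion, and nudging either coefficient away from (0.15, 1.94) increases it.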


The minimizing values of $b_0$ and $b_1$ are found by taking partial derivatives of
$f(b_0, b_1)$ with respect to both $b_0$ and $b_1$, equating them both to zero [analogously to
$f'(b) = 0$ in univariate calculus], and solving the equations

$$\frac{\partial f(b_0, b_1)}{\partial b_0} = \sum 2(y_i - b_0 - b_1 x_i)(-1) = 0$$

$$\frac{\partial f(b_0, b_1)}{\partial b_1} = \sum 2(y_i - b_0 - b_1 x_i)(-x_i) = 0$$

Cancellation of the -2 factor and rearrangement gives the following system of equations,
called the normal equations:

$$n b_0 + \left(\sum x_i\right) b_1 = \sum y_i$$

$$\left(\sum x_i\right) b_0 + \left(\sum x_i^2\right) b_1 = \sum x_i y_i$$

These equations are linear in the two unknowns $b_0$ and $b_1$. Provided that not all $x_i$’s
are identical, the least squares estimates are the unique solution to this system.
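
As a rough sketch of how the two normal equations can be solved numerically, the Python function below applies Cramer's rule to the 2 x 2 system. The function name and the sample points are assumptions made for illustration; the points were chosen to lie exactly on the line y = 1 + 2x so the answer is easy to check.

```python
def solve_normal_equations(xs, ys):
    """Solve  n*b0 + (sum x)*b1 = sum y  and
    (sum x)*b0 + (sum x^2)*b1 = sum xy  by Cramer's rule."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Determinant of the coefficient matrix; in exact arithmetic it is
    # zero precisely when all x values are identical.
    det = n * sum_xx - sum_x ** 2
    if det == 0:
        raise ValueError("all x values are identical; no unique solution")

    b0 = (sum_y * sum_xx - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


# Hypothetical points lying exactly on y = 1 + 2x
b0, b1 = solve_normal_equations([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(b0, b1)
```

With floating-point data the determinant check would normally use a small tolerance rather than exact equality; the exact test is kept here only to mirror the condition stated in the text.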
PROPOSITION The least squares estimate of the slope coefficient $\beta_1$ of the true regression line is

$$b_1 = \hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}} \qquad (12.2)$$

Computing formulas for the numerator and denominator of $\hat{\beta}_1$ are

$$S_{xy} = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n} \qquad S_{xx} = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$$

The least squares estimate of the intercept $\beta_0$ of the true regression line is

$$b_0 = \hat{\beta}_0 = \frac{\sum y_i - \hat{\beta}_1 \sum x_i}{n} = \bar{y} - \hat{\beta}_1 \bar{x} \qquad (12.3)$$


The computational formulas for $S_{xy}$ and $S_{xx}$ require only the summary statistics $\sum x_i$,
$\sum y_i$, $\sum x_i^2$, and $\sum x_i y_i$ ($\sum y_i^2$ will be needed shortly). In computing $\hat{\beta}_0$, use extra digits
in $\hat{\beta}_1$ because, if $\bar{x}$ is large in magnitude, rounding will affect the final answer. In
practice, the use of a statistical software package is preferable to hand calculation
and hand-drawn plots. Once again, be sure that the scatterplot shows a linear pattern
with relatively homogeneous variation before fitting the simple linear regression
model.


EXAMPLE 12.4 The cetane number is a critical property in specifying the ignition quality of a fuel
used in a diesel engine. Determination of this number for a biodiesel fuel is expen- 
sive and time-consuming. The article “Relating the Cetane Number of Biodiesel 
Fuels to Their Fatty Acid Composition: A Critical Study” (J. of Automobile 
Engr., 2009: 565-583) included the following data on x = iodine value (g) and 
y = cetane number for a sample of 14 biofuels. The iodine value is the amount of 
iodine necessary to saturate a sample of 100 g of oil. The article’s authors fit the 
simple linear regression model to this data, so let’s follow their lead. 


x | 132.0 129.0 120.0 113.2 105.0 92.0 84.0 83.2 88.4 59.0 80.0 81.5 71.0 69.2 
y | 46.0 48.0 51.0 52.1 54.0 52.0 59.0 58.7 61.6 64.0 61.4 54.6 58.8 58.0 


The necessary summary quantities for hand calculation can be obtained by placing
the x values in a column and the y values in another column and then creating columns
for $x^2$, $xy$, and $y^2$ (these latter values are not needed at the moment but will be
used shortly). Calculating the column sums gives $\sum x_i = 1307.5$, $\sum y_i = 779.2$,
$\sum x_i^2 = 128,913.93$, $\sum x_i y_i = 71,347.30$, $\sum y_i^2 = 43,745.22$, from which

$$S_{xx} = 128,913.93 - (1307.5)^2/14 = 6802.7693$$

$$S_{xy} = 71,347.30 - (1307.5)(779.2)/14 = -1424.41429$$

The estimated slope of the true regression line (i.e., the slope of the least squares
line) is

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{-1424.41429}{6802.7693} = -.20938742$$
We estimate that the expected change in true average cetane number associated
with a 1 g increase in iodine value is -.209, i.e., a decrease of .209. Since
$\bar{x} = 93.392857$ and $\bar{y} = 55.657143$, the estimated intercept of the true regression
line (i.e., the intercept of the least squares line) is

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 55.657143 - (-.20938742)(93.392857) = 75.212432$$

The equation of the estimated regression line (least squares line) is
$y = 75.212 - .2094x$, exactly that reported in the cited article. Figure 12.8 displays a
scatterplot of the data with the least squares line superimposed. This line provides
a very good summary of the relationship between the two variables.
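
The hand calculation above can be checked with a few lines of Python. This is only a sketch that applies the computing formulas for $S_{xx}$ and $S_{xy}$ to the 14 data pairs of this example; the variable names are arbitrary choices, not notation from the text.

```python
# Iodine value (x) and cetane number (y) for the 14 biofuels of Example 12.4
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]

n = len(x)
sum_x, sum_y = sum(x), sum(y)

# Computing formulas for S_xx and S_xy
s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n

slope = s_xy / s_xx                            # estimate of beta_1
intercept = sum_y / n - slope * (sum_x / n)    # estimate of beta_0

print(f"S_xx = {s_xx:.4f}, S_xy = {s_xy:.5f}")
print(f"slope = {slope:.8f}, intercept = {intercept:.6f}")
```

Carrying full precision in the slope before computing the intercept follows the advice given earlier about rounding.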


Figure 12.8 Scatterplot for Example 12.4 with least squares line superimposed, from
Minitab (fitted line: cet num = 75.21 - 0.2094 iod val; horizontal axis: iod val; vertical axis: cet num)


The estimated regression line can immediately be used for two different
purposes. For a fixed x value $x^*$, $\hat{\beta}_0 + \hat{\beta}_1 x^*$ (the height of the line above $x^*$) gives
either (1) a point estimate of the expected value of Y when $x = x^*$ or (2) a point
prediction of the Y value that will result from a single new observation made at
$x = x^*$.


EXAMPLE 12.5 Refer back to the iodine value-cetane number scenario described in the previous
example. The estimated regression equation was $y = 75.212 - .2094x$. A point
estimate of true average cetane number for all biofuels whose iodine value is 100 is

$$\hat{\mu}_{Y \cdot 100} = \hat{\beta}_0 + \hat{\beta}_1(100) = 75.212 - .2094(100) = 54.27$$

If a single biofuel sample whose iodine value is 100 is to be selected, 54.27 is also
a point prediction for the resulting cetane number.


The least squares line should not be used to make a prediction for an x value 
much beyond the range of the data, such as x = 40 or x = 150 in Example 12.4. The 
danger of extrapolation is that the fitted relationship (a line here) may not be valid 
for such x values. 
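
One way to guard against accidental extrapolation in software is sketched below. The function name `predict_cetane` is hypothetical, and the choice to reject every x outside the observed range of Example 12.4 (59.0 to 132.0) is a conservative assumption; the text only warns against x values much beyond that range.

```python
def predict_cetane(iodine_value, x_min=59.0, x_max=132.0):
    """Point estimate / point prediction from the fitted line
    y = 75.212 - .2094x of Example 12.4.

    Refuses to extrapolate: x_min and x_max are the smallest and largest
    iodine values in the sample (an assumed, deliberately strict cutoff).
    """
    if not (x_min <= iodine_value <= x_max):
        raise ValueError(
            f"iodine value {iodine_value} is outside the observed range "
            f"[{x_min}, {x_max}]; the fitted line may not be valid there"
        )
    return 75.212 - 0.2094 * iodine_value


print(round(predict_cetane(100.0), 2))
```

A call such as `predict_cetane(150.0)` raises `ValueError` instead of silently returning a value from a region where the linear relationship has not been observed.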
Estimating $\sigma^2$ and $\sigma$


The parameter $\sigma^2$ determines the amount of variability inherent in the regression
model. A large value of $\sigma^2$ will lead to observed $(x_i, y_i)$’s that are typically quite
spread out about the true regression line, whereas when $\sigma^2$ is small the observed
points will tend to fall very close to the true line (see Figure 12.9). An estimate of $\sigma^2$
will be used in confidence interval (CI) formulas and hypothesis-testing procedures
presented in the next two sections. Because the equation of the true line is unknown,
the estimate is based on the extent to which the sample observations deviate from
the estimated line. Many large deviations (residuals) suggest a large value of $\sigma^2$,
whereas deviations all of which are small in magnitude suggest that $\sigma^2$ is small.


Figure 12.9 Typical sample for $\sigma^2$: (a) small (y = elongation versus x = tensile force); (b) large (y = product sales versus x = advertising expenditure)


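DEFINITION The fitted (or predicted) values $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n$ are obtained by successively
substituting $x_1, \ldots, x_n$ into the equation of the estimated regression line:
$\hat{y}_1 = \hat{\beta}_0 + \hat{\beta}_1 x_1$, $\hat{y}_2 = \hat{\beta}_0 + \hat{\beta}_1 x_2$, ..., $\hat{y}_n = \hat{\beta}_0 + \hat{\beta}_1 x_n$. The residuals are the
differences $y_1 - \hat{y}_1, y_2 - \hat{y}_2, \ldots, y_n - \hat{y}_n$ between the observed and fitted y
values.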


In words, the predicted value $\hat{y}_i$ is the value of y that we would predict or expect
when using the estimated regression line with $x = x_i$; $\hat{y}_i$ is the height of the estimated
regression line above the value $x_i$ for which the ith observation was made.
The residual $y_i - \hat{y}_i$ is the vertical deviation between the point $(x_i, y_i)$ and the least
squares line: a positive number if the point lies above the line and a negative
number if it lies below the line. If the residuals are all small in magnitude, then
much of the variability in observed y values appears to be due to the linear relationship
between x and y, whereas many large residuals suggest quite a bit of inherent variability
in y relative to the amount due to the linear relation. Assuming that the line
in Figure 12.7 is the least squares line, the residuals are identified by the vertical
line segments from the observed points to the line. When the estimated regression
line is obtained via the principle of least squares, the sum of the residuals should in
theory be zero. In practice, the sum may deviate a bit from zero due to rounding.


EXAMPLE 12.6 Japan’s high population density has resulted in a multitude of resource-usage
problems. One especially serious difficulty concerns waste removal. The article
“Innovative Sludge Handling Through Pelletization Thickening” (Water
Research, 1999: 3245-3252) reported the development of a new compression
machine for processing sewage sludge. An important part of the investigation
involved relating the moisture content of compressed pellets (y, in %) to the
machine’s filtration rate (x, in kg-DS/m/hr). The following data was read from a
graph in the article:


x | 125.3 98.2 201.4 147.3 145.9 124.7 112.2 120.2 161.2 178.9
y | 77.9 76.8 81.5 79.8 78.2 78.3 77.5 77.0 80.1 80.2
x | 159.5 145.8 75.1 151.4 144.2 125.0 198.8 132.5 159.6 110.7
y | 79.9 79.0 76.7 78.2 79.5 78.1 81.5 77.0 79.0 78.6


Relevant summary quantities (summary statistics) are $\sum x_i = 2817.9$, $\sum y_i = 1574.8$,
$\sum x_i^2 = 415,949.85$, $\sum x_i y_i = 222,657.88$, and $\sum y_i^2 = 124,039.58$, from which
$\bar{x} = 140.895$, $\bar{y} = 78.74$, $S_{xx} = 18,921.8295$, and $S_{xy} = 776.434$. Thus

$$\hat{\beta}_1 = \frac{776.434}{18,921.8295} = .04103377 \approx .041$$

$$\hat{\beta}_0 = 78.74 - (.04103377)(140.895) = 72.958547 \approx 72.96$$

from which the equation of the least squares line is $y = 72.96 + .041x$. For numerical
accuracy, the fitted values are calculated from $\hat{y}_i = 72.958547 + .04103377x_i$:

$$\hat{y}_1 = 72.958547 + .04103377(125.3) \approx 78.100, \quad y_1 - \hat{y}_1 \approx -.200, \quad \text{etc.}$$


Nine of the 20 residuals are negative, so the corresponding nine points in a scat- 
terplot of the data lie below the estimated regression line. All predicted values (fits) 
and residuals appear in the accompanying table. 


Obs   Filtrate   Moistcon   Fit      Residual
 1    125.3      77.9       78.100   -0.200
 2     98.2      76.8       76.988   -0.188
 3    201.4      81.5       81.223    0.277
 4    147.3      79.8       79.003    0.797
 5    145.9      78.2       78.945   -0.745
 6    124.7      78.3       78.075    0.225
 7    112.2      77.5       77.563   -0.063
 8    120.2      77.0       77.891   -0.891
 9    161.2      80.1       79.573    0.527
10    178.9      80.2       80.299   -0.099
11    159.5      79.9       79.503    0.397
12    145.8      79.0       78.941    0.059
13     75.1      76.7       76.040    0.660
14    151.4      78.2       79.171   -0.971
15    144.2      79.5       78.876    0.624
16    125.0      78.1       78.088    0.012
17    198.8      81.5       81.116    0.384
18    132.5      77.0       78.396   -1.396
19    159.6      79.0       79.508   -0.508
20    110.7      78.6       77.501    1.099
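
The fits and residuals in the table can be reproduced with the short Python sketch below, which fits the least squares line to the 20 data pairs and then checks two facts stated in the text: nine residuals are negative, and the residuals sum to zero apart from rounding. Variable names are arbitrary choices for this illustration.

```python
# Filtration rate (x) and moisture content (y) from Example 12.6
x = [125.3, 98.2, 201.4, 147.3, 145.9, 124.7, 112.2, 120.2, 161.2, 178.9,
     159.5, 145.8, 75.1, 151.4, 144.2, 125.0, 198.8, 132.5, 159.6, 110.7]
y = [77.9, 76.8, 81.5, 79.8, 78.2, 78.3, 77.5, 77.0, 80.1, 80.2,
     79.9, 79.0, 76.7, 78.2, 79.5, 78.1, 81.5, 77.0, 79.0, 78.6]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates from the deviation form of S_xx and S_xy
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Fitted values and residuals (observed minus fitted)
fits = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fits)]
num_negative = sum(1 for r in residuals if r < 0)

for obs, (xi, yi, fi, ri) in enumerate(zip(x, y, fits, residuals), start=1):
    print(f"{obs:2d}  {xi:6.1f}  {yi:5.1f}  {fi:7.3f}  {ri:7.3f}")
print("negative residuals:", num_negative)
print("sum of residuals:", sum(residuals))
```

The printed sum of residuals is not exactly zero only because of floating-point rounding.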
In much the same way that the deviations from the mean in a one-sample situation
were combined to obtain the estimate $s^2 = \sum (x_i - \bar{x})^2/(n - 1)$, the estimate of
$\sigma^2$ in regression analysis is based on squaring and summing the residuals. We will
continue to use the symbol $s^2$ for this estimated variance, so don’t confuse it with
our previous $s^2$.


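DEFINITION The error sum of squares (equivalently, residual sum of squares), denoted
by SSE, is

$$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2$$

and the estimate of $\sigma^2$ is

$$\hat{\sigma}^2 = s^2 = \frac{\text{SSE}}{n - 2} = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2}$$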


The divisor n - 2 in $s^2$ is the number of degrees of freedom (df) associated with SSE
and the estimate $s^2$. This is because to obtain $s^2$, the two parameters $\beta_0$ and $\beta_1$ must
first be estimated, which results in a loss of 2 df (just as $\mu$ had to be estimated in one-sample
problems, resulting in an estimated variance based on n - 1 df). Replacing
each $y_i$ in the formula for $s^2$ by the rv $Y_i$ gives the estimator $S^2$. It can be shown that
$S^2$ is an unbiased estimator for $\sigma^2$ (though the estimator S is not unbiased for $\sigma$). An
interpretation of s here is similar to what we suggested earlier for the sample standard
deviation: Very roughly, it is the size of a typical vertical deviation within the
sample from the estimated regression line.


EXAMPLE 12.7 The residuals for the filtration rate-moisture content data were calculated previously.
The corresponding error sum of squares is

$$\text{SSE} = (-.200)^2 + (-.188)^2 + \cdots + (1.099)^2 = 7.968$$

The estimate of $\sigma^2$ is then $\hat{\sigma}^2 = s^2 = 7.968/(20 - 2) = .4427$, and the estimated
standard deviation is $\hat{\sigma} = s = \sqrt{.4427} = .665$. Roughly speaking, .665 is the magnitude
of a typical deviation from the estimated regression line; some points are
closer to the line than this and others are farther away.
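
The arithmetic of this example can be sketched in Python using the 20 residuals exactly as they are printed (to three decimals) in the table of Example 12.6. Because those residuals are rounded, the sum of squares agrees with the value 7.968 quoted above only to about two decimal places; the variable names are arbitrary.

```python
import math

# Residuals as printed (rounded to three decimals) in Example 12.6
residuals = [-0.200, -0.188, 0.277, 0.797, -0.745, 0.225, -0.063, -0.891,
             0.527, -0.099, 0.397, 0.059, 0.660, -0.971, 0.624, 0.012,
             0.384, -1.396, -0.508, 1.099]

n = len(residuals)
sse = sum(r ** 2 for r in residuals)   # error sum of squares
s_squared = sse / (n - 2)              # estimate of sigma^2, with n - 2 df
s = math.sqrt(s_squared)               # estimate of sigma

print(f"SSE = {sse:.3f}, s^2 = {s_squared:.4f}, s = {s:.3f}")
```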


Computation of SSE from the defining formula involves much tedious 
arithmetic, because both the predicted values and residuals must first be calculated. 
Use of the following computational formula does not require these quantities. 


$$\text{SSE} = \sum y_i^2 - \hat{\beta}_0 \sum y_i - \hat{\beta}_1 \sum x_i y_i = S_{yy} - \hat{\beta}_1 S_{xy}$$

The middle expression results from substituting $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ into $\sum (y_i - \hat{y}_i)^2$,
squaring the summand, carrying through the sum to the resulting three terms, and
simplifying. These computational formulas are especially sensitive to the effects of
rounding in $\hat{\beta}_0$ and $\hat{\beta}_1$, so carrying as many digits as possible in intermediate computations
will protect against round-off error.
EXAMPLE 12.8 The article “Promising Quantitative Nondestructive Evaluation Techniques for
Composite Materials” (Materials Evaluation, 1985: 561-565) reports on a study 
to investigate how the propagation of an ultrasonic stress wave through a substance 
depends on the properties of the substance. The accompanying data on fracture 
strength (x, as a percentage of ultimate tensile strength) and attenuation (y, in neper/ 
cm, the decrease in amplitude of the stress wave) in fiberglass-reinforced polyester 
composites was read from a graph that appeared in the article. The simple linear 
regression model is suggested by the substantial linear pattern in the scatterplot. 


x | 12 30 36 40 45 57 62 67 71 78 93 94 100 105
y | 3.3 3.2 3.4 3.0 2.8 2.9 2.7 2.6 2.5 2.6 2.2 2.0 2.3 2.1


The necessary summary quantities are n = 14, $\sum x_i = 890$, $\sum x_i^2 = 67,182$, $\sum y_i = 37.6$,
$\sum y_i^2 = 103.54$, and $\sum x_i y_i = 2234.30$, from which $S_{xx} = 10,603.4285714$,
$S_{xy} = -155.98571429$, $\hat{\beta}_1 = -.0147109$, and $\hat{\beta}_0 = 3.6209072$. Then

$$\text{SSE} = 103.54 - (3.6209072)(37.6) - (-.0147109)(2234.30) = .2624532$$

The same value results from

$$\text{SSE} = S_{yy} - \hat{\beta}_1 S_{xy} = 103.54 - (37.6)^2/14 - (-.0147109)(-155.98571429)$$

Thus $s^2 = .2624532/12 = .0218711$ and $s = .1479$. When $\hat{\beta}_0$ and $\hat{\beta}_1$ are rounded to
three decimal places in the first computational formula for SSE, the result is

$$\text{SSE} = 103.54 - (3.621)(37.6) - (-.015)(2234.30) = .905$$

which is more than three times the correct value.
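
The sensitivity to rounding seen in this example is easy to reproduce. The Python sketch below starts from the summary quantities quoted in Example 12.8 and evaluates the first computational formula twice, once with full-precision estimates and once with the estimates rounded to three decimals. The helper name `sse_computational` is a hypothetical choice. Because full double precision is carried, the first result matches .2624532 only to about four decimals, since the text's value was computed from $\hat{\beta}_1$ rounded to seven decimals.

```python
# Summary quantities quoted in Example 12.8
n = 14
sum_x, sum_xx = 890.0, 67182.0
sum_y, sum_yy = 37.6, 103.54
sum_xy = 2234.30

s_xx = sum_xx - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
b1 = s_xy / s_xx                       # estimated slope, full precision
b0 = sum_y / n - b1 * sum_x / n        # estimated intercept, full precision


def sse_computational(b0_value, b1_value):
    """SSE = sum(y^2) - b0*sum(y) - b1*sum(xy) for the supplied estimates."""
    return sum_yy - b0_value * sum_y - b1_value * sum_xy


sse_full = sse_computational(b0, b1)
sse_rounded = sse_computational(round(b0, 3), round(b1, 3))

print(f"full precision:   SSE = {sse_full:.4f}")
print(f"rounded to 3 dp:  SSE = {sse_rounded:.4f}")
```

The large discrepancy arises because SSE is a small difference of much larger terms, so small errors in the estimates are magnified.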


The Coefficient of Determination 


Figure 12.10 shows three different scatterplots of bivariate data. In all three plots, 
the heights of the different points vary substantially, indicating that there is much 
variability in observed y values. The points in the first plot all fall exactly on a 
straight line. In this case, all (100%) of the sample variation in y can be attributed to 
the fact that x and y are linearly related in combination with variation in x. The points 
in Figure 12.10(b) do not fall exactly on a line, but compared to overall y variability, 
the deviations from the least squares line are small. It is reasonable to conclude in 
this case that much of the observed y variation can be attributed to the approximate 
linear relationship between the variables postulated by the simple linear regression 
model. When the scatterplot looks like that of Figure 12.10(c), there is substantial 
variation about the least squares line relative to overall y variation, so the simple 
linear regression model fails to explain variation in y by relating y to x. 


Figure 12.10 Using the model to explain y variation: (a) data for which all variation is explained;
(b) data for which most variation is explained; (c) data for which little variation is explained
The error sum of squares SSE can be interpreted as a measure of how much
variation in y is left unexplained by the model—that is, how much cannot be 
attributed to a linear relationship. In Figure 12.10(a), SSE = 0, and there is no 
unexplained variation, whereas unexplained variation is small for the data of Figure 
12.10(b) and much larger in Figure 12.10(c). A quantitative measure of the total 
amount of variation in observed y values is given by the total sum of squares 


$$\text{SST} = S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \left(\sum y_i\right)^2/n$$


Total sum of squares is the sum of squared deviations about the sample mean
of the observed y values. Thus the same number $\bar{y}$ is subtracted from each $y_i$ in SST,
whereas SSE involves subtracting each different predicted value $\hat{y}_i$ from the corresponding
observed $y_i$. Just as SSE is the sum of squared deviations about the least
squares line $y = \hat{\beta}_0 + \hat{\beta}_1 x$, SST is the sum of squared deviations about the horizontal
line at height $\bar{y}$ (since then vertical deviations are $y_i - \bar{y}$), as pictured in Figure
12.11. Furthermore, because the sum of squared deviations about the least squares
line is smaller than the sum of squared deviations about any other line, SSE < SST
unless the horizontal line itself is the least squares line. The ratio SSE/SST is the
proportion of total variation that cannot be explained by the simple linear regression
model, and 1 - SSE/SST (a number between 0 and 1) is the proportion of observed
y variation explained by the model.


Figure 12.11 Sums of squares illustrated: (a) SSE = sum of squared deviations about the least
squares line; (b) SST = sum of squared deviations about the horizontal line


DEFINITION The coefficient of determination, denoted by $r^2$, is given by

$$r^2 = 1 - \frac{\text{SSE}}{\text{SST}}$$

It is interpreted as the proportion of observed y variation that can be explained
by the simple linear regression model (attributed to an approximate linear
relationship between y and x).


The higher the value of $r^2$, the more successful is the simple linear regression
model in explaining y variation. When regression analysis is done by a statistical
computer package, either $r^2$ or $100r^2$ (the percentage of variation explained by the
model) is a prominent part of the output. If $r^2$ is small, an analyst will usually want
to search for an alternative model (either a nonlinear model or a multiple regression
model that involves more than a single independent variable) that can more effectively
explain y variation.
EXAMPLE 12.9 The scatterplot of the iodine value-cetane number data in Figure 12.8 portends a
reasonably high $r^2$ value. With

$\hat{\beta}_0 = 75.212432$   $\hat{\beta}_1 = -.20938742$   $\sum y_i = 779.2$
$\sum x_i y_i = 71,347.30$   $\sum y_i^2 = 43,745.22$

we have

$$\text{SST} = 43,745.22 - (779.2)^2/14 = 377.174$$

$$\text{SSE} = 43,745.22 - (75.212432)(779.2) - (-.20938742)(71,347.30) = 78.920$$

The coefficient of determination is then

$$r^2 = 1 - \text{SSE}/\text{SST} = 1 - (78.920)/(377.174) = .791$$

That is, 79.1% of the observed variation in cetane number is attributable to (can be
explained by) the simple linear regression relationship between cetane number and
iodine value ($r^2$ values are even higher than this in many scientific contexts, but
social scientists would typically be ecstatic at a value anywhere near this large!).

Figure 12.12 shows partial Minitab output from the regression of cetane num- 
ber on iodine value. The software will also provide predicted values, residuals, and 
other information upon request. The formats used by other packages differ slightly 
from that of Minitab, but the information content is very similar. Regression sum of 
squares will be introduced shortly. Other quantities in Figure 12.12 that have not yet 
been discussed will surface in Section 12.3 [excepting R-Sq(adj), which comes into 
play in Chapter 13 when multiple regression models are introduced]. 


The regression equation is
cet num = 75.2 - 0.209 iod val

Predictor   Coef       SE Coef   T       P
Constant    75.212     2.984     25.21   0.000    <- estimate of beta_0
iod val     -0.20939   0.03109   -6.73   0.000    <- estimate of beta_1

s = 2.56450   R-sq = 79.1% (100r^2)   R-sq(adj) = 77.3%

Analysis of Variance

SOURCE       DF   SS       MS       F       P
Regression    1   298.25   298.25   45.35   0.000
Error        12    78.92     6.58                    <- SSE
Total        13   377.17                             <- SST

Figure 12.12 Minitab output for the regression of Examples 12.4 and 12.9


The coefficient of determination can be written in a slightly different way by
introducing a third sum of squares, regression sum of squares (SSR), given by
$\text{SSR} = \sum (\hat{y}_i - \bar{y})^2 = \text{SST} - \text{SSE}$. Regression sum of squares is interpreted as the
amount of total variation that is explained by the model. Then we have

$$r^2 = 1 - \text{SSE}/\text{SST} = (\text{SST} - \text{SSE})/\text{SST} = \text{SSR}/\text{SST}$$

the ratio of explained variation to total variation. The ANOVA table in Figure 12.12
shows that SSR = 298.25, from which $r^2 = 298.25/377.17 = .791$ as before.
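
The decomposition SST = SSE + SSR and the resulting value of $r^2$ can be checked directly from the cetane number data of Example 12.4. The Python sketch below computes all three sums of squares from their defining formulas rather than from the computational shortcuts; variable names are arbitrary choices.

```python
# Iodine value (x) and cetane number (y) from Example 12.4
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares fit
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
fits = [intercept + slope * xi for xi in x]

# Sums of squares from their definitions
sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fits))   # unexplained variation
ssr = sum((fi - y_bar) ** 2 for fi in fits)            # explained variation

r_squared = 1 - sse / sst

print(f"SST = {sst:.3f}, SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"r^2 = {r_squared:.3f}  (SSR/SST = {ssr / sst:.3f})")
```

Both expressions for $r^2$ give the same number, which is the algebraic identity stated above.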


Terminology and Scope of Regression Analysis


The term regression analysis was first used by Francis Galton in the late nineteenth 
century in connection with his work on the relationship between father’s height x 
and son’s height y. After collecting a number of pairs $(x_i, y_i)$, Galton used the principle
of least squares to obtain the equation of the estimated regression line, with the
objective of using it to predict son’s height from father’s height. In using the derived 
line, Galton found that if a father was above average in height, the son would also 
be expected to be above average in height, but not by as much as the father was. 
Similarly, the son of a shorter-than-average father would also be expected to be 
shorter than average, but not by as much as the father. Thus the predicted height of 
a son was “pulled back in” toward the mean; because regression means a coming or 
going back, Galton adopted the terminology regression line. This phenomenon of 
being pulled back in toward the mean has been observed in many other situations 
(e.g., batting averages from year to year in baseball) and is called the regression 
effect. 

Our discussion thus far has presumed that the independent variable is under 
the control of the investigator, so that only the dependent variable Y is random. 
This was not, however, the case with Galton’s experiment; fathers’ heights were 
not preselected, but instead both X and Y were random. Methods and conclusions of 
regression analysis can be applied both when the values of the independent variable 
are fixed in advance and when they are random, but because the derivations and 
interpretations are more straightforward in the former case, we will continue to work 
explicitly with it. For more commentary, see the excellent book by John Neter et al. 
listed in the chapter bibliography. 


EXERCISES Section 12.2 (12-29) 


12. Refer back to the data in Exercise 4, in which y = ammonium
concentration (mg/L) and x = transpiration (ml/h).
Summary quantities include n = 13, $\sum x_i = 303.7$, $\sum y_i = 52.8$,
$S_{xx} = 1585.230769$, $S_{xy} = -341.959231$, and $S_{yy} = 77.270769$.

a. Obtain the equation of the estimated regression line
and use it to calculate a point prediction of ammonium
concentration for a future observation made
when transpiration is 25 ml/h.

b. What happens if the estimated regression line is
used to calculate a point estimate of true average
concentration when transpiration is 45 ml/h? Why
does it not make sense to calculate this point
estimate?

c. Calculate and interpret s.

d. Do you think the simple linear regression model does
a good job of explaining observed variation in concentration?
Explain.

13. The accompanying data on x = current density (mA/cm²)
and y = rate of deposition (μm/min) appeared in the
article “Plating of 60/40 Tin/Lead Solder for Head
Termination Metallurgy” (Plating and Surface
Finishing, Jan. 1997: 38-40). Do you agree with the
claim by the article’s author that “a linear relationship
was obtained from the tin-lead rate of deposition as a
function of current density”? Explain your reasoning.

x | 20 40 60 80
y | .24 1.20 1.71 2.22

14. Refer to the tank temperature-efficiency ratio data given
in Exercise 1.

a. Determine the equation of the estimated regression
line.

b. Calculate a point estimate for true average efficiency
ratio when tank temperature is 182.

c. Calculate the values of the residuals from the least
squares line for the four observations for which temperature
is 182. Why do they not all have the same
sign?

d. What proportion of the observed variation in efficiency
ratio can be attributed to the simple linear
regression relationship between the two variables?

15. Values of modulus of elasticity (MOE, the ratio of stress,
i.e., force per unit area, to strain, i.e., deformation per unit
length, in GPa) and flexural strength (a measure of the
ability to resist failure in bending, in MPa) were determined
for a sample of concrete beams of a certain type,
resulting in the following data (read from a graph in the
article “Effects of Aggregates and Microfillers on the
Flexural Properties of Concrete,” Magazine of Concrete
Research, 1997: 81-98):


MOE 29.8 33.2 33.7 35.3 35.5 36.1 36.2 
Strength | 39 72 7:3 63 8.1 6.8 7.0 
MOE 36.3. 37.5 37.7 38.7 38.8 39.6 41.0 
Strength | 76 68 65 7.0 63 7.9 9.0 
MOE | 42.8 42.8 43.5 45.6 46.0 46.9 48.0 
Strength | 82 8.7 7.8 97 TA 7.7 9.7 
MOE 49.3 51.7 62.6 69.8 79.5 80.0 

Strength | 7.8 7.7 11.6 11.3 11.8 10.7 


a. Construct a stem-and-leaf display of the MOE val- 
ues, and comment on any interesting features. 

b. Is the value of strength completely and uniquely 
determined by the value of MOE? Explain. 

c. Use the accompanying Minitab output to obtain the 
equation of the least squares line for predicting 
strength from modulus of elasticity, and then pre- 
dict strength for a beam whose modulus of elastic- 
ity is 40. Would you feel comfortable using the 
least squares line to predict strength when modulus 
of elasticity is 100? Explain. 


Predictor       Coef     Stdev   t-ratio       P
Constant      3.2925    0.6008      5.48   0.000
mod elas     0.10748   0.01280      8.40   0.000

s = 0.8657   R-sq = 73.8%   R-sq(adj) = 72.8%

Analysis of Variance

SOURCE        DF       SS       MS       F       P
Regression     1   52.870   52.870   70.55   0.000
Error         25   18.736    0.749
Total         26   71.605


d. What are the values of SSE, SST, and the coefficient 
of determination? Do these values suggest that the 
simple linear regression model effectively describes 
the relationship between the two variables? Explain. 


16. The article “Characterization of Highway Runoff in Austin, Texas, Area” (J. of
Envir. Engr., 1998: 131-137) gave a scatterplot, along with the least squares line, of
x = rainfall volume (m$^3$) and y = runoff volume (m$^3$) for a particular location.
The accompanying values were read from the plot.

x |  5 12 14 17 23  30  40 47
y |  4 10 13 15 15  25  27 46
x | 55 67 72 81 96 112 127
y | 38 46 53 70 82  99 100

a. Does a scatterplot of the data support the use of the simple linear regression
model?
b. Calculate point estimates of the slope and intercept of the population regression
line.
c. Calculate a point estimate of the true average runoff volume when rainfall volume
is 50.
d. Calculate a point estimate of the standard deviation $\sigma$.
e. What proportion of the observed variation in runoff volume can be attributed to the
simple linear regression relationship between runoff and rainfall?


17. No-fines concrete, made from a uniformly graded coarse aggregate and a
cement-water paste, is beneficial in areas prone to excessive rainfall because of its
excellent drainage properties. The article “Pavement Thickness Design for No-Fines
Concrete Parking Lots” (J. of Trans. Engr., 1995: 476-484) employed a least squares
analysis in studying how y = porosity (%) is related to x = unit weight (pcf) in
concrete specimens. Consider the following representative data:

x |  99.0 101.1 102.7 103.0 105.4 107.0 108.7 110.8
y |  28.8  27.9  27.0  25.2  22.8  21.5  20.9  19.6
x | 112.1 112.4 113.6 113.8 115.1 115.4 120.0
y |  17.1  18.9  16.0  16.7  13.0  13.6  10.8

Relevant summary quantities are $\sum x_i = 1640.1$, $\sum y_i = 299.8$,
$\sum x_i^2 = 179{,}849.73$, $\sum x_i y_i = 32{,}308.59$, $\sum y_i^2 = 6430.06$.

a. Obtain the equation of the estimated regression line. Then create a scatterplot of
the data and graph the estimated line. Does it appear that the model relationship will
explain a great deal of the observed variation in y?
b. Interpret the slope of the least squares line.
c. What happens if the estimated line is used to predict porosity when unit weight is
135? Why is this not a good idea?
d. Calculate the residuals corresponding to the first two observations.
e. Calculate and interpret a point estimate of $\sigma$.
f. What proportion of observed variation in porosity can be attributed to the
approximate linear relationship between unit weight and porosity?


18. For the past decade, rubber powder has been used in asphalt cement to improve
performance. The article “Experimental Study of Recycled Rubber-Filled
High-Strength Concrete” (Magazine of Concrete Res., 2009: 549-556) includes a
regression of y = axial strength (MPa) on x = cube strength (MPa) based on the
following sample data:

x | 112.3 97.0 92.7 86.0 102.0 99.2 95.8 103.5 89.0 86.7
y |  75.0 71.0 57.7 48.7  74.3 73.3 68.0  59.3 57.8 48.5

a. Obtain the equation of the least squares line, and interpret its slope.
b. Calculate and interpret the coefficient of determination.
c. Calculate and interpret an estimate of the error standard deviation $\sigma$ in the
simple linear regression model.


19. The following data is representative of that reported in the article “An
Experimental Correlation of Oxides of Nitrogen Emissions from Power Boilers Based
on Field Data” (J. of Engr. for Power, July 1973: 165-170), with x = burner-area
liberation rate (MBtu/hr-ft$^2$) and y = NO$_x$ emission rate (ppm):

x | 100 125 125 150 150 200 200
y | 150 140 180 210 190 320 280
x | 250 250 300 300 350 400 400
y | 400 430 440 390 600 610 670

a. Assuming that the simple linear regression model is valid, obtain the least squares
estimate of the true regression line.
b. What is the estimate of expected NO$_x$ emission rate when burner area liberation
rate equals 225?
c. Estimate the amount by which you expect NO$_x$ emission rate to change when
burner area liberation rate is decreased by 50.
d. Would you use the estimated regression line to predict emission rate for a
liberation rate of 500? Why or why not?


20. The bond behavior of reinforcing bars is an important determinant of strength and
stability. The article “Experimental Study on the Bond Behavior of Reinforcing Bars
Embedded in Concrete Subjected to Lateral Pressure” (J. of Materials in Civil Engr.,
2012: 125-133) reported the results of one experiment in which varying levels of
lateral pressure were applied to 21 concrete cube specimens, each with an embedded
16-mm plain steel round bar, and the corresponding bond capacity was determined.
Due to differing concrete cube strengths ($f_{cu}$, in MPa), the applied lateral
pressure was equivalent to a fixed proportion of the specimen’s $f_{cu}$
(0, $.1f_{cu}$, ..., $.6f_{cu}$). Also, since bond strength can be heavily influenced by
the specimen’s $f_{cu}$, bond capacity was expressed as the ratio of bond strength
(MPa) to $\sqrt{f_{cu}}$.

Pressure | 0     0     0     .1    .1    .1    .2
Ratio    | 0.123 0.100 0.101 0.172 0.133 0.107 0.217
Pressure | .2    .2    .3    .3    .3    .4    .4
Ratio    | 0.172 0.151 0.263 0.227 0.252 0.310 0.365
Pressure | .4    .5    .5    .5    .6    .6    .6
Ratio    | 0.239 0.365 0.319 0.312 0.394 0.386 0.320

a. Does a scatterplot of the data support the use of the simple linear regression
model?
b. Use the accompanying Minitab output to give point estimates of the slope and
intercept of the population regression line.
c. Calculate a point estimate of the true average bond capacity when lateral pressure
is $.45f_{cu}$.
d. What is a point estimate of the error standard deviation $\sigma$, and how would
you interpret it?
e. What is the value of total variation, and what proportion of it can be explained by
the model relationship?

The regression equation is
Ratio = 0.101 + 0.461 Pressure

Predictor      Coef   SE Coef       T       P
Constant    0.10121   0.01308    7.74   0.000
Pressure    0.46071   0.03627   12.70   0.000

S = 0.0332397   R-Sq = 89.5%   R-Sq(adj) = 88.9%

Analysis of Variance

Source           DF        SS        MS        F       P
Regression        1   0.17830   0.17830   161.37   0.000
Residual Error   19   0.02099   0.00110
Total            20   0.19929


21. Wrinkle recovery angle and tensile strength are the two most important
characteristics for evaluating the performance of crosslinked cotton fabric. An
increase in the degree of crosslinking, as determined by ester carboxyl band
absorbance, improves the wrinkle resistance of the fabric (at the expense of reducing
mechanical strength). The accompanying data on x = absorbance and y = wrinkle
resistance angle was read from a graph in the paper “Predicting the Performance of
Durable Press Finished Cotton Fabric with Infrared Spectroscopy” (Textile Res. J.,
1999: 145-151).

x | .115 .126 .183 .246 .282 .344 .355 .452 .491 .554 .651
y | 334  342  355  363  365  372  381  392  400  412  420

Here is regression output from Minitab:

Predictor      Coef   SE Coef        T       P
Constant    321.878     2.483   129.64   0.000
absorb      156.711     6.464    24.24   0.000

S = 3.60498   R-Sq = 98.5%   R-Sq(adj) = 98.3%

Source           DF       SS       MS        F       P
Regression        1   7639.0   7639.0   587.81   0.000
Residual Error    9    117.0     13.0
Total            10   7756.0

a. Does the simple linear regression model appear to be appropriate? Explain.
b. What wrinkle resistance angle would you predict for a fabric specimen having an
absorbance of .300?
c. What would be the estimate of expected wrinkle resistance angle when absorbance
is .300?


22. Calcium phosphate cement is gaining increasing attention for use in bone repair
applications. The article “Short-Fibre Reinforcement of Calcium Phosphate Bone
Cement” (J. of Engr. in Med., 2007: 203-211) reported on a study in which
polypropylene fibers were used in an attempt to improve fracture behavior. The
following data on x = fiber weight (%) and y = compressive strength (MPa) was
provided by the article’s authors.

x |  0.00  0.00  0.00  0.00  0.00  1.25  1.25  1.25  1.25
y |  9.94 11.67 11.00 13.44  9.20  9.92  9.79 10.99 11.32
x |  2.50  2.50  2.50  2.50  2.50  5.00  5.00  5.00  5.00
y | 12.29  8.69  9.91 10.45 10.25  7.89  7.61  8.07  9.04
x |  7.50  7.50  7.50  7.50 10.00 10.00 10.00 10.00
y |  6.63  6.43  7.03  7.63  7.35  6.94  7.02  7.67

a. Fit the simple linear regression model to this data. Then determine the proportion
of observed variation in strength that can be attributed to the model relationship
between strength and fiber weight. Finally, obtain a point estimate of the standard
deviation of $\epsilon$, the random deviation in the model equation.
b. The average strength values for the six different levels of fiber weight are 11.05,
10.51, 10.32, 8.15, 6.93, and 7.24, respectively. The cited paper included a figure in
which the average strength was regressed against fiber weight. Obtain the equation of
this regression line and calculate the corresponding coefficient of determination.
Explain the difference between the $r^2$ value for this regression and the $r^2$ value
obtained in (a).


23. a. Obtain SSE for the data in Exercise 19 from the defining formula
[$\text{SSE} = \sum (y_i - \hat{y}_i)^2$], and compare to the value calculated from
the computational formula.
b. Calculate the value of total sum of squares. Does the simple linear regression
model appear to do an effective job of explaining variation in emission rate? Justify
your assertion.


24. The invasive diatom species Didymosphenia geminata has the potential to inflict
substantial ecological and economic damage in rivers. The article “Substrate
Characteristics Affect Colonization by the Bloom-Forming Diatom Didymosphenia
geminata” (Aquatic Ecology, 2010: 33-40) described an investigation of colonization
behavior. One aspect of particular interest was whether y = colony density was
related to x = rock surface area. The article contained a scatterplot and summary of a
regression analysis. Here is representative data:

x |  50   71  55  50  33  58  79  26
y | 152 1929  48  22   2   5  35   7
x |  69   44  37  70  20  45  49
y | 269   38 171  13  43 185  25

a. Fit the simple linear regression model to this data, predict colony density when
surface area = 70 and when surface area = 71, and calculate the corresponding
residuals. How do they compare?
b. Calculate and interpret the coefficient of determination.
c. The second observation has a very extreme y value (in the full data set consisting of
72 observations, there were two of these). This observation may have had a
substantial impact on the fit of the model and subsequent conclusions. Eliminate it
and recalculate the equation of the estimated regression line. Does it appear to differ
substantially from the equation before the deletion? What is the impact on $r^2$
and $s$?


25. Show that $b_1$ and $b_0$ of expressions (12.2) and (12.3) satisfy the normal
equations.


26. Show that the “point of averages” $(\bar{x}, \bar{y})$ lies on the estimated
regression line.


27. Suppose an investigator has data on the amount of shelf space x devoted to display
of a particular product and sales revenue y for that product. The investigator may
wish to fit a model for which the true regression line passes through (0, 0). The
appropriate model is $Y = \beta_1 x + \epsilon$. Assume that
$(x_1, y_1), \ldots, (x_n, y_n)$ are observed pairs generated from this model, and
derive the least squares estimator of $\beta_1$. [Hint: Write the sum of squared
deviations as a function of $b_1$, a trial value, and use calculus to find the
minimizing value of $b_1$.]


28. a. Consider the data in Exercise 20. Suppose that instead of the least squares line
passing through the points $(x_1, y_1), \ldots, (x_n, y_n)$, we wish the least squares
line passing through $(x_1 - \bar{x}, y_1), \ldots, (x_n - \bar{x}, y_n)$. Construct a
scatterplot of the $(x_i, y_i)$ points and then of the $(x_i - \bar{x}, y_i)$ points.
Use the plots to explain intuitively how the two least squares lines are related to one
another.
b. Suppose that instead of the model $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$
$(i = 1, \ldots, n)$, we wish to fit a model of the form
$Y_i = \beta_0^* + \beta_1^*(x_i - \bar{x}) + \epsilon_i$ $(i = 1, \ldots, n)$. What
are the least squares estimators of $\beta_0^*$ and $\beta_1^*$, and how do they
relate to $\hat{\beta}_0$ and $\hat{\beta}_1$?


29. Consider the following three data sets, in which the variables of interest are
x = commuting distance and y = commuting time. Based on a scatterplot and the
values of $s$ and $r^2$, in which situation would simple linear regression be most
(least) effective, and why?


Data Set 1 2 3 


19 49 29 63 23 31 
115 50 60 


$S_{xx}$           17.50        1270.8333    1270.8333
$S_{xy}$           29.50        2722.5       1431.6667
$\hat{\beta}_1$    1.685714     2.142295     1.126557
$\hat{\beta}_0$    13.666672    7.868852     3.196729
SST                114.83       5897.5       1627.33
SSE                65.10        65.10        14.48


12.3 Inferences About the Slope Parameter $\beta_1$


In virtually all of our inferential work thus far, the notion of sampling variability has 
been pervasive. In particular, properties of sampling distributions of various statis- 
tics have been the basis for developing confidence interval formulas and hypothesis- 
testing methods. The key idea here is that the value of any quantity calculated from 
sample data—the value of any statistic—will vary from one sample to another. 


EXAMPLE 12.10 Reconsider the data on x = burner area liberation rate and y = NO$_x$ emission rate from
Exercise 12.19 in the previous section. There are 14 observations, made at the x values
100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, and 400, respectively.
Suppose that the slope and intercept of the true regression line are $\beta_1 = 1.70$ and
$\beta_0 = -50$, with $\sigma = 35$ (consistent with the values $\hat{\beta}_1 = 1.7114$,
$\hat{\beta}_0 = -45.55$, $s = 36.75$). We proceeded to generate a sample of random deviations
$\epsilon_1, \ldots, \epsilon_{14}$ from a normal distribution with mean 0 and standard
deviation 35 and then added $\epsilon_i$ to $\beta_0 + \beta_1 x_i$ to obtain 14
corresponding y values. Regression calculations were then carried out
to obtain the estimated slope, intercept, and standard deviation. This process was
repeated a total of 20 times, resulting in the values given in Table 12.1.


Table 12.1 Simulation Results for Example 12.10 


     $\hat{\beta}_1$  $\hat{\beta}_0$   $s$          $\hat{\beta}_1$  $\hat{\beta}_0$   $s$
 1.  1.7559   -60.62   43.23      11.  1.7843   -67.36   41.80
 2.  1.6400   -49.40   30.69      12.  1.5822   -28.64   32.46
 3.  1.4699    -4.80   36.26      13.  1.8194   -83.99   40.80
 4.  1.6944   -41.95   22.89      14.  1.6469   -32.03   28.11
 5.  1.4497     5.80   36.84      15.  1.7712   -52.66   33.04
 6.  1.7309   -70.01   39.56      16.  1.7004   -58.06   43.44
 7.  1.8890   -95.01   42.37      17.  1.6103   -27.89   25.60
 8.  1.6471   -40.30   43.71      18.  1.6396   -24.89   40.78
 9.  1.7216   -42.68   23.68      19.  1.7857   -77.31   32.38
10.  1.7058   -63.31   31.58      20.  1.6342   -17.00   30.93


There is clearly variation in values of the estimated slope and estimated intercept,
as well as the estimated standard deviation. The equation of the least squares
line thus varies from one sample to the next. Figure 12.13 on page 511 shows a
dotplot of the estimated slopes as well as graphs of the true regression line and the
20 sample regression lines.
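The simulation in Example 12.10 can be imitated with a few lines of code. The following is a minimal sketch, not the program used to produce Table 12.1: the function names `least_squares` and `simulate` and the random seed are illustrative assumptions, so the numbers it prints will differ from those in the table while showing the same sample-to-sample variation.

```python
import math
import random

# x values at which the 14 observations of Example 12.10 were made
X = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]


def least_squares(x, y):
    """Return (slope, intercept, s) for the least squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, math.sqrt(sse / (n - 2))


def simulate(beta0=-50.0, beta1=1.70, sigma=35.0, reps=20, seed=1):
    """Generate `reps` samples from the assumed true line and fit each one."""
    rng = random.Random(seed)  # arbitrary seed, chosen only for reproducibility
    fits = []
    for _ in range(reps):
        # y = beta0 + beta1 * x + normal random deviation with sd sigma
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in X]
        fits.append(least_squares(X, y))
    return fits


if __name__ == "__main__":
    for k, (b1, b0, s) in enumerate(simulate(), start=1):
        print(f"{k:2d}. slope={b1:.4f}  intercept={b0:8.2f}  s={s:.2f}")
```

Each run of `simulate` with a different seed gives a different set of 20 estimated lines, which is exactly the sampling variability that the rest of this section quantifies.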


The slope $\beta_1$ of the population regression line is the true average change in the
dependent variable y associated with a 1-unit increase in the independent variable x.
The slope of the least squares line, $\hat{\beta}_1$, gives a point estimate of $\beta_1$.
In the same way that a confidence interval for $\mu$ and procedures for testing hypotheses
about $\mu$ were based on properties of the sampling distribution of $\bar{X}$, further
inferences about $\beta_1$ are based on thinking of $\hat{\beta}_1$ as a statistic and
investigating its sampling distribution.

The values of the $x_i$’s are assumed to be chosen before the experiment is
performed, so only the $Y_i$’s are random. The estimators (statistics, and thus random
variables) for $\beta_0$ and $\beta_1$ are obtained by replacing $y_i$ by $Y_i$ in (12.2)
and (12.3):

$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(Y_i - \bar{Y})}{\sum (x_i - \bar{x})^2}
\qquad
\hat{\beta}_0 = \frac{\sum Y_i - \hat{\beta}_1 \sum x_i}{n}$$


Figure 12.13 Simulation results from Example 12.10: (a) dotplot of estimated slopes; (b)
graphs of the true regression line and 20 least squares lines (from S-Plus)


Similarly, the estimator for $\sigma^2$ results from replacing each $y_i$ in the formula
for $s^2$ by the rv $Y_i$:

$$\hat{\sigma}^2 = S^2 = \frac{\sum Y_i^2 - \hat{\beta}_0 \sum Y_i - \hat{\beta}_1 \sum x_i Y_i}{n - 2}$$


The denominator of $\hat{\beta}_1$, $S_{xx} = \sum (x_i - \bar{x})^2$, depends only on the
$x_i$’s and not on the $Y_i$’s, so it is a constant. Then because
$\sum (x_i - \bar{x})\bar{Y} = \bar{Y}\sum (x_i - \bar{x}) = \bar{Y} \cdot 0 = 0$, the
slope estimator can be written as

$$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})Y_i}{S_{xx}} = \sum c_i Y_i
\qquad \text{where } c_i = (x_i - \bar{x})/S_{xx}$$

That is, $\hat{\beta}_1$ is a linear function of the independent rv’s
$Y_1, Y_2, \ldots, Y_n$, each of which is normally distributed. Invoking properties of a
linear function of random variables discussed in Section 5.5 leads to the following results.

PROPOSITION

1. The mean value of $\hat{\beta}_1$ is $E(\hat{\beta}_1) = \mu_{\hat{\beta}_1} = \beta_1$,
so $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$ (the distribution of
$\hat{\beta}_1$ is always centered at the value of $\beta_1$).


2. The variance and standard deviation of $\hat{\beta}_1$ are

$$V(\hat{\beta}_1) = \sigma^2_{\hat{\beta}_1} = \frac{\sigma^2}{S_{xx}}
\qquad \sigma_{\hat{\beta}_1} = \frac{\sigma}{\sqrt{S_{xx}}} \qquad (12.4)$$

where $S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - (\sum x_i)^2/n$. Replacing $\sigma$
by its estimate $s$ gives an estimate for $\sigma_{\hat{\beta}_1}$ (the estimated standard
deviation, i.e., estimated standard error, of $\hat{\beta}_1$):

$$s_{\hat{\beta}_1} = \frac{s}{\sqrt{S_{xx}}}$$

(This estimate can also be denoted by $\hat{\sigma}_{\hat{\beta}_1}$.)


3. The estimator $\hat{\beta}_1$ has a normal distribution (because it is a linear function
of independent normal rv’s).
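
The first two results can be checked directly from the representation of the slope estimator as $\sum c_i Y_i$ with $c_i = (x_i - \bar{x})/S_{xx}$. The following is a sketch of that calculation, assuming only the model facts $E(Y_i) = \beta_0 + \beta_1 x_i$ and $V(Y_i) = \sigma^2$ together with independence of the $Y_i$’s.

```latex
\begin{align*}
\sum c_i &= \frac{\sum (x_i - \bar{x})}{S_{xx}} = 0, \qquad
\sum c_i x_i = \frac{\sum (x_i - \bar{x})x_i}{S_{xx}} = \frac{S_{xx}}{S_{xx}} = 1, \qquad
\sum c_i^2 = \frac{\sum (x_i - \bar{x})^2}{S_{xx}^2} = \frac{1}{S_{xx}} \\[4pt]
E(\hat{\beta}_1) &= \sum c_i E(Y_i) = \sum c_i(\beta_0 + \beta_1 x_i)
  = \beta_0 \sum c_i + \beta_1 \sum c_i x_i = \beta_1 \\[4pt]
V(\hat{\beta}_1) &= \sum c_i^2 V(Y_i) = \sigma^2 \sum c_i^2 = \frac{\sigma^2}{S_{xx}}
\end{align*}
```

The middle identity in the first line uses $\sum (x_i - \bar{x})\bar{x} = 0$, so that $\sum (x_i - \bar{x})x_i = \sum (x_i - \bar{x})^2$; the variance line uses independence of the $Y_i$’s.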


According to (12.4), the variance of $\hat{\beta}_1$ equals the variance $\sigma^2$ of the
random error term (or, equivalently, of any $Y_i$) divided by $\sum (x_i - \bar{x})^2$. This
denominator is a measure of how spread out the $x_i$’s are about $\bar{x}$. Therefore
making observations at $x_i$ values that are quite spread out results in a more precise
estimator of the slope parameter (smaller variance of $\hat{\beta}_1$), whereas values of
$x_i$ all close to one another imply a highly variable estimator. Of course, if the
$x_i$’s are spread out too far, a linear model may not be appropriate throughout the range
of observation.
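
The effect of the spread of the x values on the precision of the slope estimator can be seen numerically. The sketch below compares the standard deviation in (12.4) for the x values of Example 12.10 with that for a hypothetical design (an assumption made purely for illustration) having the same number of observations bunched between 225 and 290; the helper name `slope_sd` is also illustrative.

```python
import math


def slope_sd(x, sigma):
    """Standard deviation of the slope estimator from (12.4): sigma / sqrt(S_xx)."""
    x_bar = sum(x) / len(x)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma / math.sqrt(s_xx)


# Design from Example 12.10: x values spread from 100 to 400
spread_out = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]
# Hypothetical design with the same n but x values bunched between 225 and 290
bunched = [225 + 5 * i for i in range(14)]

sigma = 35.0  # error standard deviation assumed in Example 12.10
print(f"spread-out design: sd of slope estimator = {slope_sd(spread_out, sigma):.4f}")
print(f"bunched design:    sd of slope estimator = {slope_sd(bunched, sigma):.4f}")
```

With the same $\sigma$ and the same n, the bunched design gives a standard deviation several times larger, which is the point of the preceding paragraph.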

Many inferential procedures discussed previously were based on standardizing
an estimator by first subtracting its mean value and then dividing by its estimated
standard deviation. In particular, test procedures and a CI for the mean $\mu$ of a normal
population utilized the fact that the standardized variable
$(\bar{X} - \mu)/(S/\sqrt{n})$, that is, $(\bar{X} - \mu)/S_{\bar{X}}$, had a t distribution
with n − 1 df. A similar result here provides the key to further inferences concerning
$\beta_1$.


THEOREM The assumptions of the simple linear regression model imply that the
standardized variable

$$T = \frac{\hat{\beta}_1 - \beta_1}{S/\sqrt{S_{xx}}} = \frac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}}$$

has a t distribution with n − 2 df.


A Confidence Interval for $\beta_1$


As in the derivation of previous CIs, we begin with a probability statement:

$$P\left(-t_{\alpha/2,n-2} < \frac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}} < t_{\alpha/2,n-2}\right) = 1 - \alpha$$

Manipulation of the inequalities inside the parentheses to isolate $\beta_1$ and substitution
of estimates in place of the estimators gives the CI formula.


A 100(1 − α)% CI for the slope $\beta_1$ of the true regression line is

$$\hat{\beta}_1 \pm t_{\alpha/2,n-2} \cdot s_{\hat{\beta}_1}$$


This interval has the same general form as did many of our previous intervals. It
is centered at the point estimate of the parameter, and the amount it extends out to
each side depends on the desired confidence level (through the t critical value) and
on the amount of variability in the estimator $\hat{\beta}_1$ (through $s_{\hat{\beta}_1}$,
which will tend to be small when there is little variability in the distribution of
$\hat{\beta}_1$ and large otherwise).


EXAMPLE 12.11 When damage to a timber structure occurs, it may be more economical to repair 
the damaged area rather than replace the entire structure. The article “Simplified 
Model for Strength Assessment of Timber Beams Joined by Bonded Plates” 
(J. of Materials in Civil Engr., 2013: 980-990) investigated a particular strategy 
for repair. The accompanying data was used by the authors of the article as a basis 
for fitting the simple linear regression model. The dependent variable is y = rupture 
load (N) and the independent variable is anchorage length (the additional length of 
material used to bond at the junction, in mm). 


x | 50 50 80 80 110 110 140 140 170 170 
y | 17,052 14,063 26,264 19,600 21,952 26,362 26,362 26,754 31,654 32,928 


Note that the relationship between anchorage length and rupture load is clearly not 
deterministic, since there are observations with identical x values but different y values. 
Figure 12.14 shows a scatterplot of the data (also displayed in the cited article); there 
appears to be a rather substantial positive linear relationship between the two variables. 


Figure 12.14 Scatterplot of the data from Example 12.11 (rupture load versus anchorage length)


Summary quantities include $S_{xx} = 18{,}000$, $S_{xy} = 2{,}225{,}579.40$,
$S_{yy} = \text{SST} = 331{,}839{,}568.9$, $\hat{\beta}_1 = 123.6433$,
$\hat{\beta}_0 = 10{,}698.33$, SSE = 56,661,439.1, and $r^2 = .829$.
Roughly 83% of the observed variation in rupture load can be attributed to the simple
linear regression model relationship between rupture load and anchorage length. Error df
is 10 − 2 = 8, from which $s^2 = 56{,}661{,}439.1/8 = 7{,}082{,}679.89$ and $s = 2661.33$.
The estimated standard error of $\hat{\beta}_1$ is

$$s_{\hat{\beta}_1} = \frac{s}{\sqrt{S_{xx}}} = \frac{2661.33}{\sqrt{18{,}000}} = 19.836$$

A confidence level of 95% requires $t_{.025,8} = 2.306$. The CI is

$$123.64 \pm (2.306)(19.836) = 123.64 \pm 45.74 = (77.90, 169.38)$$

With a high degree of confidence, we estimate that an increase in true average rupture
strength of between 77.90 N and 169.38 N is associated with an increase of 1 mm in
anchorage length (at least for anchorage lengths between 50 mm and 170 mm). This
interval is not overly narrow, a consequence of the small sample size and substantial
variability about the estimated regression line. Notice that the interval includes only
positive values, so we can be quite confident of the tendency for strength to increase
as anchorage length increases.
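
The interval calculation in this example is easy to reproduce from the raw data. The following is a minimal sketch using only the Python standard library; the function name `slope_ci` is an illustrative assumption, and the t critical value 2.306 is passed in from the text rather than computed, since the standard library has no t distribution.

```python
import math

# Data of Example 12.11: x = anchorage length (mm), y = rupture load (N)
x = [50, 50, 80, 80, 110, 110, 140, 140, 170, 170]
y = [17052, 14063, 26264, 19600, 21952, 26362, 26362, 26754, 31654, 32928]


def slope_ci(x, y, t_crit):
    """Return (slope, standard error, (lower, upper)) for the slope CI.

    t_crit is the t critical value t_{alpha/2, n-2}, supplied by the caller
    (for example, read from a t table).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))      # estimate of sigma
    se_b1 = s / math.sqrt(s_xx)       # estimated standard error of the slope
    half_width = t_crit * se_b1
    return b1, se_b1, (b1 - half_width, b1 + half_width)


b1, se_b1, (lower, upper) = slope_ci(x, y, t_crit=2.306)
print(f"slope = {b1:.4f}, se = {se_b1:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the code carries full precision rather than the rounded intermediate values used above, the printed endpoints may differ from the hand calculation in the last decimal place.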

Figure 12.15 displays regression output from the SAS package. The value of
$s_{\hat{\beta}_1}$ is found under Parameter Estimates as the second number in the Standard
Error column. There is also an estimated standard error for $\hat{\beta}_0$, from which a
confidence interval for the intercept of the population regression line can be calculated.
The last two columns of the Parameter Estimates table give information about testing certain
hypotheses, our next topic of discussion.


Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1        275178130     275178130     38.85   0.0003
Error               8         56661439       7082680
Corrected Total     9        331839569

Root MSE          2661.33047    R-Square   0.8293
Dependent Mean    24299         Adj R-Sq   0.8079
Coeff Var         10.95238

Parameter Estimates

Variable      DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept      1                10698       2338.67544
Anch Lngth     1            123.64333         19.83639      6.23     0.0003


Figure 12.15 SAS output for the data of Example 12.11


Hypothesis-Testing Procedures


As before, the null hypothesis in a test about $\beta_1$ will be an equality statement. The
null value (value of $\beta_1$ claimed true by the null hypothesis) is denoted by
$\beta_{10}$ (read “beta one nought,” not “beta ten”). The test statistic results from
replacing $\beta_1$ by the null value $\beta_{10}$ in the standardized variable T, that is,
from standardizing the estimator of $\beta_1$ under the assumption that $H_0$ is true. The
test statistic thus has a t distribution with n − 2 df when $H_0$ is true, which allows for
determination of a P-value as described for t tests in Chapters 8 and 9.

The most commonly encountered pair of hypotheses about $\beta_1$ is $H_0$: $\beta_1 = 0$
versus $H_a$: $\beta_1 \neq 0$. When this null hypothesis is true,
$\mu_{Y \cdot x} = \beta_0$ independent of x. Then knowledge of x gives no information about
the value of the dependent variable. A test of these two hypotheses is often referred to as
the model utility test in simple linear regression. Unless n is quite small, $H_0$ will be
rejected and the utility of the model confirmed precisely when $r^2$ is reasonably large.
The simple linear regression model should not be used for further inferences (estimates of
mean value or predictions of future values) unless the model utility test results in
rejection of $H_0$ for a suitably small $\alpha$.


Null hypothesis: $H_0$: $\beta_1 = \beta_{10}$

Test statistic value: $t = \dfrac{\hat{\beta}_1 - \beta_{10}}{s_{\hat{\beta}_1}}$

Alternative Hypothesis             P-Value Determination
$H_a$: $\beta_1 > \beta_{10}$      Area under the $t_{n-2}$ curve to the right of t
$H_a$: $\beta_1 < \beta_{10}$      Area under the $t_{n-2}$ curve to the left of t
$H_a$: $\beta_1 \neq \beta_{10}$   2 · (Area under the $t_{n-2}$ curve to the right of |t|)

The model utility test is the test of $H_0$: $\beta_1 = 0$ versus $H_a$: $\beta_1 \neq 0$,
in which case the test statistic value is the t ratio $t = \hat{\beta}_1/s_{\hat{\beta}_1}$.


EXAMPLE 12.12 Mopeds are very popular in Europe because of cost and ease of operation. However,
they can be dangerous if performance characteristics are modified. One of the features
commonly manipulated is the maximum speed. The article “Procedure to
Verify the Maximum Speed of Automatic Transmission Mopeds in Periodic
Motor Vehicle Inspections” (J. of Automotive Engr., 2008: 1615-1623) included
a simple linear regression analysis of the variables x = test track speed (km/h) and
y = rolling test speed. Here is data read from a graph in the article:

x | 42.2 42.6 43.3 43.5 43.7 44.1 44.9 45.3 45.7
y | 44   44   44   45   45   46   46   46   47
x | 45.7 45.9 46.0 46.2 46.2 46.8 46.8 47.1 47.2
y | 48   48   48   47   48   48   49   49   49

A scatterplot of the data shows a substantial linear pattern. The Minitab output
in Figure 12.16 gives the coefficient of determination as $r^2 = .923$, which certainly
portends a useful linear relationship. Let’s carry out the model utility test at a
significance level $\alpha = .01$.


The regression equation is
roll spd = -2.22 + 1.08 trk spd

Predictor      Coef   SE Coef       T       P
Constant     -2.224     3.528   -0.63   0.537
trk spd     1.08342   0.07806   13.88   0.000   <- P-value for model utility test

(In the trk spd row, SE Coef is $s_{\hat{\beta}_1}$ and T is $t = \hat{\beta}_1/s_{\hat{\beta}_1}$.)

S = 0.506890   R-Sq = 92.3%   R-Sq(adj) = 91.9%

Analysis of Variance

Source           DF       SS       MS        F       P
Regression        1   49.500   49.500   192.65   0.000
Residual Error   16    4.111    0.257
Total            17   53.611


Figure 12.16 Minitab output for the moped data of Example 12.12

The parameter of interest is $\beta_1$, the expected change in rolling test speed
associated with a 1 km/h increase in test track speed. The null hypothesis $H_0$:
$\beta_1 = 0$ will be rejected in favor of the alternative $H_a$: $\beta_1 \neq 0$ if the
t ratio $t = \hat{\beta}_1/s_{\hat{\beta}_1}$ falls too far into either tail of the
$t_{n-2}$ curve (resulting in a small P-value). From Figure 12.16,
$\hat{\beta}_1 = 1.08342$, $s_{\hat{\beta}_1} = .07806$, and

$$t = \frac{1.08342}{.07806} = 13.88 \quad \text{(also on output)}$$

The P-value is twice the area captured under the 16 df t curve to the right of 13.88.
Minitab gives P-value = .000. Thus the null hypothesis of no useful linear relationship
can be rejected at any reasonable significance level. This confirms the utility
of the model, and gives us license to calculate various estimates and predictions as
described in Section 12.4.
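
The t ratio for the model utility test can be recomputed from the moped data. The sketch below is illustrative only: the function name `t_ratio` is an assumption, and because the Python standard library has no t distribution, the code does not compute a P-value; instead it compares |t| with the tabled critical value $t_{.005,16} = 2.921$, which is equivalent to a two-tailed test at $\alpha = .01$.

```python
import math

# Data of Example 12.12: x = test track speed (km/h), y = rolling test speed
x = [42.2, 42.6, 43.3, 43.5, 43.7, 44.1, 44.9, 45.3, 45.7,
     45.7, 45.9, 46.0, 46.2, 46.2, 46.8, 46.8, 47.1, 47.2]
y = [44, 44, 44, 45, 45, 46, 46, 46, 47,
     48, 48, 48, 47, 48, 48, 49, 49, 49]


def t_ratio(x, y, beta10=0.0):
    """Test statistic for H0: beta_1 = beta10 (beta10 = 0 gives the model utility test)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(s_xx)
    return (b1 - beta10) / se_b1


T_CRIT = 2.921  # t_{.005,16} from a t table (two-tailed test at alpha = .01)

t = t_ratio(x, y)
print(f"t = {t:.2f}; reject H0 at alpha = .01: {abs(t) > T_CRIT}")
```

Passing a nonzero `beta10` gives the test statistic for any other null value of the slope, as in the boxed procedure above.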


Regression and ANOVA


The decomposition of the total sum of squares $\sum (y_i - \bar{y})^2$ into a part SSE, which
measures unexplained variation, and a part SSR, which measures variation explained
by the linear relationship, is strongly reminiscent of one-way ANOVA. In fact, the
null hypothesis $H_0$: $\beta_1 = 0$ can be tested against $H_a$: $\beta_1 \neq 0$ by
constructing an ANOVA table (Table 12.2) and determining the P-value for the F test.


Table 12.2 ANOVA Table for Simple Linear Regression

Source of Variation   df      Sum of Squares   Mean Square            f
Regression            1       SSR              SSR                    SSR / [SSE/(n − 2)]
Error                 n − 2   SSE              $s^2$ = SSE/(n − 2)
Total                 n − 1   SST


The F test gives exactly the same result as the model utility t test because $t^2 = f$
and $t^2_{\alpha/2,n-2} = F_{\alpha,1,n-2}$. Virtually all computer packages that have
regression options include such an ANOVA table in the output. For example, Figure 12.15
shows SAS output for the rupture load data of Example 12.11. The ANOVA table at the top of
the output has f = 38.85 with a P-value of .0003 for the model utility test. The table
of parameter estimates gives t = 6.23, again with P = .0003 and $38.85 \approx (6.23)^2$
(they would be identical if more decimal accuracy were shown).
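
The identity between f and the square of the t ratio can be confirmed numerically. The following sketch recomputes both quantities from the rupture load data of Example 12.11 using only the standard library; the variable names are illustrative, and the code is a check of the algebra rather than a reproduction of the SAS computation.

```python
import math

# Data of Example 12.11: x = anchorage length (mm), y = rupture load (N)
x = [50, 50, 80, 80, 110, 110, 140, 140, 170, 170]
y = [17052, 14063, 26264, 19600, 21952, 26362, 26362, 26754, 31654, 32928]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

sst = sum((yi - y_bar) ** 2 for yi in y)                         # total variation
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))    # unexplained variation
ssr = sst - sse                                                  # explained variation

mse = sse / (n - 2)                  # s^2, the mean square for error
f = ssr / mse                        # F statistic from the ANOVA table
t = b1 / math.sqrt(mse / s_xx)       # t ratio for the model utility test

print(f"f = {f:.2f}, t = {t:.2f}, t^2 = {t * t:.2f}")
```

Apart from floating-point rounding, f and t² agree exactly, since SSR equals $\hat{\beta}_1 S_{xy}$ and the squared t ratio equals $\hat{\beta}_1^2 S_{xx}/s^2$.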


EXERCISES Section 12.3 (30—43) 


30. Reconsider the situation described in Exercise 7, in which x = accelerated strength
of concrete and y = 28-day cured strength. Suppose the simple linear regression model
is valid for x between 1000 and 4000 and that $\beta_1 = 1.25$ and $\sigma = 350$.
Consider an experiment in which n = 7, and the x values at which observations are made
are $x_1 = 1000$, $x_2 = 1500$, $x_3 = 2000$, $x_4 = 2500$, $x_5 = 3000$,
$x_6 = 3500$, and $x_7 = 4000$.
a. Calculate $\sigma_{\hat{\beta}_1}$, the standard deviation of $\hat{\beta}_1$.
b. What is the probability that the estimated slope based on such observations will be
between 1.00 and 1.50?
c. Suppose it is also possible to make a single observation at each of the n = 11
values $x_1 = 2000$, $x_2 = 2100$, ..., $x_{11} = 3000$. If a major objective is to
estimate $\beta_1$ as accurately as possible, would the experiment with n = 11 be
preferable to the one with n = 7?

31. During oil drilling operations, components of the drilling 
assembly may suffer from sulfide stress cracking. The 
article “Composition Optimization of High-Strength 

Steels for Sulfide Cracking Resistance Improvement” 

(Corrosion Science, 2009: 2878-2884) reported on a 

study in which the composition of a standard grade of steel 

was analyzed. The following data on y = threshold stress 

(% SMYS) and x = yield strength (MPa) was read from a 

graph in the article (which also included the equation of 

the least squares line). 


x | 635 644 711 708 836 820 810 870 856 923 878 937 948 


y | 100 93 88 84 77 75 74 63 57 55 47 43 38 


dix; = 10,576, Sy, = 894, Dx? = 8,741,264, 
Sy? = 66,224, Six, = 703,192 


a. What proportion of observed variation in stress can 
be attributed to the approximate linear relationship 
between the two variables? 

b. Compute the estimated standard deviation $s_{\hat\beta_1}$.

c. Calculate a confidence interval using confidence 
level 95% for the expected change in stress associ- 
ated with a 1 MPa increase in strength. Does it 
appear that this true average change has been pre- 
cisely estimated? 


32. Exercise 16 of Section 12.2 gave data on x = rainfall
volume and y = runoff volume (both in m³). Use the
accompanying Minitab output to decide whether there is
a useful linear relationship between rainfall and runoff,
and then calculate a confidence interval for the true
average change in runoff volume associated with a 1 m³
increase in rainfall volume.


The regression equation is
runoff = -1.13 + 0.827 rainfall


Predictor   Coef      Stdev     t-ratio   P
Constant    -1.128    2.368     -0.48     0.642
rainfall    0.82697   0.03652   22.64     0.000

s = 5.240   R-sq = 97.5%   R-sq(adj) = 97.3%


33. Exercise 15 of Section 12.2 included Minitab output for 
a regression of flexural strength of concrete beams on 
modulus of elasticity. 

a. Use the output to calculate a confidence interval with
a confidence level of 95% for the slope $\beta_1$ of the
population regression line, and interpret the resulting
interval.

b. Suppose it had previously been believed that when
modulus of elasticity increased by 1 GPa, the associated
true average change in flexural strength would
be at most .1 MPa. Does the sample data contradict
this belief? State and test the relevant hypotheses.


34. Electromagnetic technologies offer effective nondestruc- 
tive sensing techniques for determining characteristics of 
pavement. The propagation of electromagnetic waves 
through the material depends on its dielectric properties. 
The following data, kindly provided by the authors of the 
article “Dielectric Modeling of Asphalt Mixtures and 
Relationship with Density” (J. of Transp. Engr., 2011: 
104-111), was used to relate y = dielectric constant to x = 
air void (%) for 18 samples having 5% asphalt content: 


y | 4.55 4.49 4.50 4.47 4.47 4.45 4.40 4.34 4.43
x | 4.35 4.79 5.57 5.20 5.07 5.79 5.36 6.40 5.66
y | 4.43 4.42 4.40 4.33 4.44 4.40 4.26 4.32 4.34
x | 5.90 6.49 5.70 6.49 6.37 6.51 7.88 6.74 7.08


The following R output is from a simple linear regression
of y on x:


              Estimate    Std. Error   t value   Pr(>|t|)
(Intercept)   4.858691    0.059768     81.283    <2e-16
AirVoid       -0.074676   0.009923     -7.526    1.21e-06


Residual standard error: 0.03551 on 16 DF
Multiple R-squared: 0.7797, Adjusted R-squared: 0.766
F-statistic: 56.63 on 1 and 16 DF, p-value: 1.214e-06


Analysis of Variance Table

Response: Dielectric

            Df   Sum Sq     Mean Sq    F value   Pr(>F)
AirVoid     1    0.071422   0.071422   56.635    1.214e-06
Residuals   16   0.020178   0.001261


a. Obtain the equation of the least squares line and 
interpret its slope. 

b. What proportion of observed variation in dielectric
constant can be attributed to the approximate linear
relationship between dielectric constant and air void?

c. Does there appear to be a useful linear relationship
between dielectric constant and air void? State and
test the appropriate hypotheses.

d. Suppose it had previously been believed that when
air void increased by 1 percent, the associated true
average change in dielectric constant would be at
least −.05. Does the sample data contradict this
belief? Carry out a test of appropriate hypotheses
using a significance level of .01.


35. How does lateral acceleration (side forces experienced
in turns that are largely under driver control) affect nausea
as perceived by bus passengers? The article “Motion
Sickness in Public Road Transport: The Effect of
Driver, Route, and Vehicle” (Ergonomics, 1999: 1646-1664)
reported data on x = motion sickness dose (calculated
in accordance with a British standard for evaluating
similar motion at sea) and y = reported nausea (%).
Relevant summary quantities are


n = 17, $\sum x_i$ = 222.1, $\sum y_i$ = 193, $\sum x_i^2$ = 3056.69,
$\sum x_i y_i$ = 2759.6, $\sum y_i^2$ = 2975


Values of dose in the sample ranged from 6.0 to 17.6.

a. Assuming that the simple linear regression model is 
valid for relating these two variables (this is suppor- 
ted by the raw data), calculate and interpret an esti- 
mate of the slope parameter that conveys information 
about the precision and reliability of estimation. 

b. Does it appear that there is a useful linear relationship
between these two variables? Test appropriate
hypotheses using α = .01.

c. Would it be sensible to use the simple linear regres- 
sion model as a basis for predicting % nausea when 
dose = 5.0? Explain your reasoning. 

d. When Minitab was used to fit the simple linear 
regression model to the raw data, the observation 
(6.0, 2.50) was flagged as possibly having a substan- 
tial impact on the fit. Eliminate this observation from 
the sample and recalculate the estimate of part (a). 
Based on this, does the observation appear to be 
exerting an undue influence? 


36. Mist (airborne droplets or aerosols) is generated when metal-removing
fluids are used in machining operations to cool
and lubricate the tool and workpiece. Mist generation is a
concern to OSHA, which has recently lowered substantially
the workplace standard. The article “Variables Affecting
Mist Generation from Metal Removal Fluids”
(Lubrication Engr., 2002: 10-17) gave the accompanying
data on x = fluid-flow velocity for a 5% soluble oil (cm/sec)
and y = the extent of mist droplets having diameters smaller
than 10 µm (mg/m³):


x | 89 177 189 354 362 442 965

y | .40 .60 .48 .66 .61 .69 .99


a. The investigators performed a simple linear regres- 
sion analysis to relate the two variables. Does a scat- 
terplot of the data support this strategy? 

b. What proportion of observed variation in mist can be 
attributed to the simple linear regression relationship 
between velocity and mist? 

c. The investigators were particularly interested in the 
impact on mist of increasing velocity from 100 to 
1000 (a factor of 10 corresponding to the difference 
between the smallest and largest x values in the 
sample). When x increases in this way, is there sub- 
stantial evidence that the true average increase in y is 
less than .6? 

d. Estimate the true average change in mist associated
with a 1 cm/sec increase in velocity, and do so in a
way that conveys information about precision and
reliability.


37. Magnetic resonance imaging (MRI) is well established as
a tool for measuring blood velocities and volume flows.
The article “Correlation Analysis of Stenotic Aortic
Valve Flow Patterns Using Phase Contrast MRI,” referenced
in Exercise 1.67, proposed using this methodology
for determination of valve area in patients with aortic
stenosis. The accompanying data on peak velocity (m/s)
from scans of 23 patients in two different planes was read
from a graph in the cited paper.


Level-  | .60 .82 .85 .89 .95 1.01 1.01 1.05
Level-- | .50 .68 .76 .64 .68 .86 .79 1.03
Level-  | 1.08 1.11 1.18 1.17 1.22 1.29 1.28 1.32
Level-- | .75 .90 .79 .86 .99 .80 1.10 1.15
Level-  | 1.37 1.53 1.55 1.85 1.93 1.93 2.14
Level-- | 1.04 1.16 1.28 1.39 1.57 1.39 1.32


a. Does there appear to be a difference between true 
average velocity in the two different planes? Carry 
out an appropriate test of hypotheses (as did the 
authors of the article). 

b. The authors of the article also regressed level-- 
velocity against level- velocity. The resulting esti- 
mated intercept and slope are .14701 and .65393, with 
corresponding estimated standard errors .07877 and 
.05947, coefficient of determination .852, and 
s = .110673. The article included a comment that this 
regression showed evidence of a strong linear rela- 
tionship but a regression slope well below 1. Do you 
agree? 


38. Refer to the data on x = liberation rate and y = $\mathrm{NO}_x$
emission rate given in Exercise 19.

a. Does the simple linear regression model specify a
useful relationship between the two rates? Use the
appropriate test procedure to obtain information
about the P-value, and then reach a conclusion at
significance level .01.

b. Compute a 95% CI for the expected change in emission
rate associated with a 10 MBtu/hr-ft² increase in liberation
rate.


39. Carry out the model utility test using the ANOVA
approach for the filtration rate-moisture content data of
Example 12.6. Verify that it gives a result equivalent to
that of the t test.


40. Use the rules of expected value to show that $\hat\beta_0$ is an unbiased
estimator for $\beta_0$ (assuming that $\hat\beta_1$ is unbiased for $\beta_1$).


41. a. Verify that $E(\hat\beta_1) = \beta_1$ by using the rules of expected
value from Chapter 5.

b. Use the rules of variance from Chapter 5 to verify the
expression for $V(\hat\beta_1)$ given in this section.


42. Verify that if each $x_i$ is multiplied by a positive constant c
and each $y_i$ is multiplied by another positive constant d, the
t statistic for testing $H_0$: $\beta_1 = 0$ versus $H_a$: $\beta_1 \neq 0$ is
unchanged in value (the value of $\hat\beta_1$ will change, which
shows that the magnitude of $\hat\beta_1$ is not by itself indicative of
model utility).


43. The probability of a type II error for the t test for
$H_0$: $\beta_1 = \beta_{10}$ can be computed in the same manner as it
was computed for the t tests of Chapter 8. If the alternative
value of $\beta_1$ is denoted by $\beta_1'$, the value of

$$d = \frac{|\beta_{10} - \beta_1'|}{\sigma\sqrt{\dfrac{n-1}{S_{xx}}}}$$

is first calculated, then the appropriate set of curves in
Appendix Table A.17 is entered on the horizontal axis at the
value of d, and β is read from the curve for n − 2 df. An
article in the Journal of Public Health Engineering
reports the results of a regression analysis based on n = 15
observations in which x = filter application temperature
(°C) and y = % efficiency of BOD removal. Calculated
quantities include $\sum x_i$ = 402, $\sum x_i^2$ = 11,098, s = 3.725,
and $\hat\beta_1 = 1.7035$. Consider testing at level .01 $H_0$: $\beta_1 = 1$,
which states that the expected increase in % BOD removal
is 1 when filter application temperature increases by 1°C,
against the alternative $H_a$: $\beta_1 > 1$. Determine P(type II
error) when $\beta_1' = 2$, $\sigma = 4$.


12.4 Inferences Concerning $\mu_{Y \cdot x^*}$ and the Prediction of Future Y Values


Let x* denote a specified value of the independent variable x. Once the estimates $\hat\beta_0$
and $\hat\beta_1$ have been calculated, $\hat\beta_0 + \hat\beta_1 x^*$ can be regarded either as a point estimate
of $\mu_{Y \cdot x^*}$ (the expected or true average value of Y when x = x*) or as a prediction of
the Y value that will result from a single observation made when x = x*. The point
estimate or prediction by itself gives no information concerning how precisely $\mu_{Y \cdot x^*}$
has been estimated or Y has been predicted. This can be remedied by developing a
CI for $\mu_{Y \cdot x^*}$ and a prediction interval (PI) for a single Y value.

Before we obtain sample data, both $\hat\beta_0$ and $\hat\beta_1$ are subject to sampling
variability; that is, they are both statistics whose values will vary from sample to
sample. Suppose, for example, that $\beta_0 = 50$ and $\beta_1 = 2$. Then a first sample of
(x, y) pairs might give $\hat\beta_0 = 52.35$, $\hat\beta_1 = 1.895$; a second sample might result in
$\hat\beta_0 = 46.52$, $\hat\beta_1 = 2.056$; and so on. It follows that $\hat Y = \hat\beta_0 + \hat\beta_1 x^*$ itself varies in
value from sample to sample, so it is a statistic. If the intercept and slope of the population
line are the aforementioned values 50 and 2, respectively, and x* = 10, then
this statistic is trying to estimate the value 50 + 2(10) = 70. The estimate from a
first sample might be 52.35 + 1.895(10) = 71.30, from a second sample might be
46.52 + 2.056(10) = 67.08, and so on.

This variation in the value of $\hat\beta_0 + \hat\beta_1 x^*$ can be visualized by returning to
Figure 12.13 on page 511. Consider the value x* = 300. The heights of the 20 pictured
estimated regression lines above this value are all somewhat different from
one another. The same is true of the heights of the lines above the value x* = 350.
In fact, there appears to be more variation in the value of $\hat\beta_0 + \hat\beta_1(350)$ than in the
value of $\hat\beta_0 + \hat\beta_1(300)$. We shall see shortly that this is because 350 is further from
$\bar{x} = 235.71$ (the "center of the data") than is 300.

Methods for making inferences about $\beta_1$ were based on properties of the sampling
distribution of the statistic $\hat\beta_1$. In the same way, inferences about the mean Y
value $\beta_0 + \beta_1 x^*$ are based on properties of the sampling distribution of the statistic
$\hat\beta_0 + \hat\beta_1 x^*$. Substitution of the expressions for $\hat\beta_0$ and $\hat\beta_1$ into $\hat\beta_0 + \hat\beta_1 x^*$ followed by
some algebraic manipulation leads to the representation of $\hat\beta_0 + \hat\beta_1 x^*$ as a linear function
of the $Y_i$'s:

$$\hat\beta_0 + \hat\beta_1 x^* = \sum_{i=1}^{n}\left[\frac{1}{n} + \frac{(x^* - \bar{x})(x_i - \bar{x})}{S_{xx}}\right] Y_i = \sum_{i=1}^{n} d_i Y_i$$

The coefficients $d_1, d_2, \ldots, d_n$ in this linear function involve the $x_i$'s and x*, all of
which are fixed. Application of the rules of Section 5.5 to this linear function gives
the following properties.

PROPOSITION Let $\hat Y = \hat\beta_0 + \hat\beta_1 x^*$, where x* is some fixed value of x. Then

1. The mean value of $\hat Y$ is

$$E(\hat Y) = E(\hat\beta_0 + \hat\beta_1 x^*) = \mu_{\hat\beta_0 + \hat\beta_1 x^*} = \beta_0 + \beta_1 x^*$$

Thus $\hat\beta_0 + \hat\beta_1 x^*$ is an unbiased estimator for $\beta_0 + \beta_1 x^*$ (i.e., for $\mu_{Y \cdot x^*}$).

2. The variance of $\hat Y$ is

$$V(\hat Y) = \sigma_{\hat Y}^2 = \sigma^2\left[\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum x_i^2 - (\sum x_i)^2/n}\right] = \sigma^2\left[\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}\right]$$

and the standard deviation $\sigma_{\hat Y}$ is the square root of this expression. The
estimated standard deviation of $\hat\beta_0 + \hat\beta_1 x^*$, denoted by $s_{\hat Y}$ or $s_{\hat\beta_0 + \hat\beta_1 x^*}$, results
from replacing σ by its estimate s:

$$s_{\hat Y} = s_{\hat\beta_0 + \hat\beta_1 x^*} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

3. $\hat Y$ has a normal distribution.


The variance of $\hat\beta_0 + \hat\beta_1 x^*$ is smallest when $x^* = \bar{x}$ and increases as x* moves away
from $\bar{x}$ in either direction. Thus the estimator of $\mu_{Y \cdot x^*}$ is more precise when x* is
near the center of the $x_i$'s than when it is far from the x values at which observations
have been made. This will imply that both the CI and PI are narrower for an x* near
$\bar{x}$ than for an x* far from $\bar{x}$. Most statistical computer packages will provide both
$\hat\beta_0 + \hat\beta_1 x^*$ and $s_{\hat\beta_0 + \hat\beta_1 x^*}$ for any specified x* upon request.
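
As a rough numerical check of the linear-function representation and of the variance formula, the following Python sketch may help. The five (x, y) pairs and all function names are made up for illustration and do not come from the text. The sketch compares $\sum d_i y_i$ with $\hat\beta_0 + \hat\beta_1 x^*$ and prints the factor $\sqrt{1/n + (x^* - \bar{x})^2 / S_{xx}}$, which should grow as x* moves away from $\bar{x}$.

```python
import math

# Hypothetical data for illustration only (not taken from the text).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

def summary(x):
    """Return n, the sample mean of x, and Sxx = sum of (x_i - xbar)^2."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return n, xbar, sxx

def least_squares(x, y):
    """Least squares estimates (b0, b1) of the intercept and slope."""
    n, xbar, sxx = summary(x)
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

def d_coefficients(x, x_star):
    """Coefficients d_i for which b0 + b1*x_star equals sum(d_i * y_i)."""
    n, xbar, sxx = summary(x)
    return [1 / n + (x_star - xbar) * (xi - xbar) / sxx for xi in x]

def sd_factor(x, x_star):
    """sqrt(1/n + (x* - xbar)^2 / Sxx); multiply by sigma (or s) to get the SD."""
    n, xbar, sxx = summary(x)
    return math.sqrt(1 / n + (x_star - xbar) ** 2 / sxx)

b0, b1 = least_squares(x, y)
for x_star in (3.0, 4.0, 5.0):
    d = d_coefficients(x, x_star)
    linear_form = sum(di * yi for di, yi in zip(d, y))
    # The second and third printed values should agree for each x_star.
    print(x_star, round(b0 + b1 * x_star, 4), round(linear_form, 4),
          round(sd_factor(x, x_star), 4))
```

Because the $Y_i$'s are independent with common variance $\sigma^2$, the variance of the linear function is $\sigma^2 \sum d_i^2$, and $\sum d_i^2$ works out to exactly $1/n + (x^* - \bar{x})^2 / S_{xx}$, which is the expression in the proposition.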


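Inferences Concerning $\mu_{Y \cdot x^*}$


Just as inferential procedures for $\beta_1$ were based on the t variable obtained by standardizing
$\hat\beta_1$, a t variable obtained by standardizing $\hat\beta_0 + \hat\beta_1 x^*$ leads to a CI and test
procedures here.


THEOREM The variable

$$T = \frac{\hat\beta_0 + \hat\beta_1 x^* - (\beta_0 + \beta_1 x^*)}{S_{\hat\beta_0 + \hat\beta_1 x^*}} = \frac{\hat Y - (\beta_0 + \beta_1 x^*)}{S_{\hat Y}} \qquad (12.5)$$

has a t distribution with n − 2 df.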


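A probability statement involving this standardized variable can now be manipulated
to yield a confidence interval for $\mu_{Y \cdot x^*}$.


A 100(1 − α)% CI for $\mu_{Y \cdot x^*}$, the expected value of Y when x = x*, is

$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2,\,n-2} \cdot s_{\hat\beta_0 + \hat\beta_1 x^*} = \hat y \pm t_{\alpha/2,\,n-2} \cdot s_{\hat Y} \qquad (12.6)$$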


This CI is centered at the point estimate for $\mu_{Y \cdot x^*}$ and extends out to each side by an
amount that depends on the confidence level and on the extent of variability in the
estimator on which the point estimate is based.

EXAMPLE 12.13 Corrosion of steel reinforcing bars is the most important durability problem for reinforced
concrete structures. Carbonation of concrete results from a chemical reaction
that lowers the pH value by enough to initiate corrosion of the rebar. Representative
data on x = carbonation depth (mm) and y = strength (MPa) for a sample of core
specimens taken from a particular building follows (read from a plot in the article
“The Carbonation of Concrete Structures in the Tropical Environment of
Singapore,” Magazine of Concrete Res., 1996: 293-300).


x 8.0 15.0 16.5 20.0 20.0 27.5 30.0 30.0 35.0 
y 22.8 27.2 23.1 17.1 21.5 18.6 16.1 23.4 13.4 
x 38.0 40.0 45.0 50.0 50.0 55.0 55.0 59.0 65.0 
y 19.5 12.4 13.2 11.4 10.3 14.1 9.7 12.0 6.8 


[Figure 12.17: plot of strength (vertical axis) versus depth (horizontal axis, 0 to 70) showing the
fitted regression line Y = 27.1829 − 0.297561X (R-Sq = 76.6%) together with 95% CI and 95% PI curves]


Figure 12.17 Minitab scatterplot with confidence intervals and prediction intervals for the data
of Example 12.13


A scatterplot of the data (see Figure 12.17) gives strong support for use of the simple
linear regression model. Relevant quantities are as follows:

$\sum x_i$ = 659.0    $\sum x_i^2$ = 28,967.50    $\bar{x}$ = 36.6111    $S_{xx}$ = 4840.7778
$\sum y_i$ = 293.2    $\sum x_i y_i$ = 9293.95    $\sum y_i^2$ = 5335.76
$\hat\beta_1$ = −.297561    $\hat\beta_0$ = 27.182936    SSE = 131.2402
$r^2$ = .766    s = 2.8640

Let’s now calculate a confidence interval, using a 95% confidence level, for the mean
strength for all core specimens having a carbonation depth of 45 mm, that is, a confidence
interval for $\beta_0 + \beta_1(45)$. The interval is centered at

$$\hat y = \hat\beta_0 + \hat\beta_1(45) = 27.18 - .2976(45) = 13.79$$

The estimated standard deviation of the statistic $\hat Y$ is

$$s_{\hat Y} = 2.8640\sqrt{\frac{1}{18} + \frac{(45 - 36.6111)^2}{4840.7778}} = .7582$$

The 16 df t critical value for a 95% confidence level is 2.120, from which we determine
the desired interval to be

13.79 ± (2.120)(.7582) = 13.79 ± 1.61 = (12.18, 15.40)


The narrowness of this interval suggests that we have reasonably precise information
about the mean value being estimated. Remember that if we recalculated this interval
for sample after sample, in the long run about 95% of the calculated intervals would
include $\beta_0 + \beta_1(45)$. We can only hope that this mean value lies in the single interval
that we have calculated.

Figure 12.18 shows Minitab output resulting from a request to fit the simple
linear regression model and calculate confidence intervals for the mean value of
strength at depths of 45 mm and 35 mm. The intervals are at the bottom of the output;
note that the second interval is narrower than the first, because 35 is much closer
to $\bar{x}$ than is 45. Figure 12.17 shows (1) curves corresponding to the confidence limits
for each different x value and (2) prediction limits, to be discussed shortly. Notice
how the curves get farther and farther apart as x moves away from $\bar{x}$.


The regression equation is
strength = 27.2 - 0.298 depth

Predictor   Coef       Stdev     t-ratio   P
Constant    27.183     1.651     16.46     0.000
depth       -0.29756   0.04116   -7.23     0.000

s = 2.864   R-sq = 76.6%   R-sq(adj) = 75.1%

Analysis of Variance
SOURCE       DF   SS       MS       F       P
Regression   1    428.62   428.62   52.25   0.000
Error        16   131.24   8.20
Total        17   559.86

Fit      Stdev.Fit   95.0% C.I.         95.0% P.I.
13.793   0.758       (12.185, 15.401)   (7.510, 20.075)
Fit      Stdev.Fit   95.0% C.I.         95.0% P.I.
16.768   0.678       (15.330, 18.207)   (10.527, 23.009)

Figure 12.18 Minitab regression output for the data of Example 12.13
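
The calculation in Example 12.13 can be retraced from the summary quantities alone. The Python sketch below is one possible way to do this, not the procedure Minitab uses; the variable and function names are illustrative, and the t critical value 2.120 is simply copied from the example rather than computed. Results should agree with Figure 12.18 up to rounding.

```python
import math

# Summary quantities quoted in Example 12.13.
n = 18
sum_x, sum_x2 = 659.0, 28967.50
sum_y, sum_y2 = 293.2, 5335.76
sum_xy = 9293.95
T_CRIT = 2.120  # 16 df, 95% confidence; value copied from the example

xbar = sum_x / n
sxx = sum_x2 - sum_x ** 2 / n
sxy = sum_xy - sum_x * sum_y / n
syy = sum_y2 - sum_y ** 2 / n

b1 = sxy / sxx                 # estimated slope
b0 = sum_y / n - b1 * xbar     # estimated intercept
sse = syy - b1 * sxy           # error sum of squares
s = math.sqrt(sse / (n - 2))   # estimate of sigma

def mean_ci(x_star, t_crit=T_CRIT):
    """Point estimate, its estimated SD, and the CI (12.6) for the mean of Y at x_star."""
    fit = b0 + b1 * x_star
    se_fit = s * math.sqrt(1 / n + (x_star - xbar) ** 2 / sxx)
    return fit, se_fit, (fit - t_crit * se_fit, fit + t_crit * se_fit)

for depth in (45, 35):
    fit, se_fit, (lo, hi) = mean_ci(depth)
    print(depth, round(fit, 3), round(se_fit, 3), (round(lo, 3), round(hi, 3)))
```

The interval at depth 35 comes out narrower than the one at depth 45 because 35 is closer to $\bar{x} = 36.61$, so the term $(x^* - \bar{x})^2 / S_{xx}$ is smaller.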


In some situations, a CI is desired not just for a single x value but for two or
more x values. Suppose an investigator wishes a CI both for $\mu_{Y \cdot v}$ and for $\mu_{Y \cdot w}$, where
v and w are two different values of the independent variable. It is tempting to compute
the interval (12.6) first for x = v and then for x = w. Suppose we use α = .05 in each
computation to get two 95% intervals. Then if the variables involved in computing the
two intervals were independent of one another, the joint confidence coefficient would
be (.95) · (.95) ≈ .90.

However, the intervals are not independent because the same $\hat\beta_0$, $\hat\beta_1$, and S are
used in each. We therefore cannot assert that the joint confidence level for the two
intervals is exactly 90%. It can be shown, though, that if the 100(1 − α)% CI (12.6)
is computed both for x = v and x = w to obtain joint CIs for $\mu_{Y \cdot v}$ and $\mu_{Y \cdot w}$, then the
joint confidence level on the resulting pair of intervals is at least 100(1 − 2α)%.
In particular, using α = .05 results in a joint confidence level of at least 90%,
whereas using α = .01 results in at least 98% confidence. For example, in Example
12.13 a 95% CI for $\mu_{Y \cdot 45}$ was (12.185, 15.401) and a 95% CI for $\mu_{Y \cdot 35}$ was (15.330,
18.207). The simultaneous or joint confidence level for the two statements
12.185 < $\mu_{Y \cdot 45}$ < 15.401 and 15.330 < $\mu_{Y \cdot 35}$ < 18.207 is at least 90%.
The validity of these joint or simultaneous CIs rests on a probability result
called the Bonferroni inequality, so the joint CIs are referred to as Bonferroni
intervals. The method is easily generalized to yield joint intervals for k different
$\mu_{Y \cdot x}$'s. Using the interval (12.6) separately first for $x = x_1^*$, then for $x = x_2^*$, ..., and
finally for $x = x_k^*$ yields a set of k CIs for which the joint or simultaneous confidence
level is guaranteed to be at least 100(1 − kα)%.
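
The Bonferroni bookkeeping is simple enough to sketch in a few lines of Python. The function names below are illustrative, not part of any statistical package. The first function gives the guaranteed lower bound 1 − kα on the joint level; the second goes the other way and gives the per-interval α that secures a desired joint level.

```python
def joint_level_lower_bound(alpha, k):
    """Bonferroni lower bound on the joint confidence level of k intervals,
    each computed with individual level 1 - alpha."""
    return max(0.0, 1 - k * alpha)

def per_interval_alpha(joint_alpha, k):
    """Individual alpha to use in each of k intervals so that the joint
    confidence level is at least 1 - joint_alpha."""
    return joint_alpha / k

# Two 95% intervals: the joint level is at least 90%.
print(joint_level_lower_bound(0.05, 2))
# Two 99% intervals: the joint level is at least 98%.
print(joint_level_lower_bound(0.01, 2))
# Three intervals with a joint level of at least 97% call for
# individual 99% intervals (alpha = .01 each).
print(per_interval_alpha(0.03, 3))
```

The bound is conservative: it holds whatever the dependence between the intervals, which is why it can be stated without knowing the joint distribution of the estimators at the different x values.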

Tests of hypotheses about $\beta_0 + \beta_1 x^*$ are based on the test statistic T obtained
by replacing $\beta_0 + \beta_1 x^*$ in the numerator of (12.5) by the null value $\mu_0$. For example,
$H_0$: $\beta_0 + \beta_1(45) = 15$ in Example 12.13 says that when carbonation depth
is 45, expected (i.e., true average) strength is 15. The test statistic value is then
$t = [\hat\beta_0 + \hat\beta_1(45) - 15]/s_{\hat\beta_0 + \hat\beta_1(45)}$, and the test is upper-, lower-, or two-tailed
according to the inequality in $H_a$.
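
To make the test statistic concrete, here is a small Python sketch that evaluates it for $H_0$: $\beta_0 + \beta_1(45) = 15$ using the values 13.79 and .7582 from Example 12.13. The two-sided alternative and the significance level .05 (critical value 2.120 for 16 df, as quoted in that example) are assumptions made for illustration; the text does not carry out this test.

```python
def t_statistic(estimate, null_value, se):
    """Standardize the point estimate of the mean of Y at x* under H0."""
    return (estimate - null_value) / se

Y_HAT = 13.79    # point estimate at depth 45 (Example 12.13)
SE_FIT = 0.7582  # estimated standard deviation of the estimator (Example 12.13)
T_CRIT = 2.120   # assumed two-tailed test at level .05 with 16 df

t = t_statistic(Y_HAT, 15.0, SE_FIT)
reject = abs(t) > T_CRIT  # two-tailed rejection rule

print(round(t, 3), reject)
```

Under these assumptions the statistic falls inside the interval (−2.120, 2.120), which is consistent with 15 lying inside the 95% CI (12.18, 15.40) computed in the example.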


A Prediction Interval for a Future Value of Y 


Rather than calculate an interval estimate for $\mu_{Y \cdot x^*}$, an investigator may wish to
obtain an interval of plausible values for the value of Y associated with some future
observation when the independent variable has value x*. Consider, for example,
relating vocabulary size y to age of a child x. The CI (12.6) with x* = 6 would
provide an estimate of true average vocabulary size for all 6-year-old children.
Alternatively, we might wish an interval of plausible values for the vocabulary size
of a particular 6-year-old child.

A CI refers to a parameter, or population characteristic, whose value is fixed
but unknown to us. In contrast, a future value of Y is not a parameter but instead
a random variable; for this reason we refer to an interval of plausible values for a
future Y as a prediction interval rather than a confidence interval. The error of
estimation is $\beta_0 + \beta_1 x^* - (\hat\beta_0 + \hat\beta_1 x^*)$, a difference between a fixed (but unknown)
quantity and a random variable. The error of prediction is $Y - (\hat\beta_0 + \hat\beta_1 x^*)$, a difference
between two random variables. There is thus more uncertainty in prediction
than in estimation, so a PI will be wider than a CI. Because the future value Y is
independent of the observed $Y_i$'s,

$$\begin{aligned}
V[Y - (\hat\beta_0 + \hat\beta_1 x^*)] &= \text{variance of prediction error} \\
&= V(Y) + V(\hat\beta_0 + \hat\beta_1 x^*) \\
&= \sigma^2 + \sigma^2\left[\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}\right] \\
&= \sigma^2\left[1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}\right]
\end{aligned}$$


Furthermore, because $E(Y) = \beta_0 + \beta_1 x^*$ and $E(\hat\beta_0 + \hat\beta_1 x^*) = \beta_0 + \beta_1 x^*$, the
expected value of the prediction error is $E(Y - (\hat\beta_0 + \hat\beta_1 x^*)) = 0$. It can then be
shown that the standardized variable

$$T = \frac{Y - (\hat\beta_0 + \hat\beta_1 x^*)}{S\sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{S_{xx}}}}$$

has a t distribution with n − 2 df. Substituting this T into the probability statement
$P(-t_{\alpha/2,\,n-2} < T < t_{\alpha/2,\,n-2}) = 1 - \alpha$ and manipulating to isolate Y between the two
inequalities yields the following interval.

A 100(1 − α)% PI for a future Y observation to be made when x = x* is

$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2,\,n-2} \cdot s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}
= \hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2,\,n-2} \cdot \sqrt{s^2 + s_{\hat\beta_0 + \hat\beta_1 x^*}^2} \qquad (12.7)$$


The interpretation of the prediction level 100(1 − α)% is analogous to that of previous
confidence levels: if (12.7) is used repeatedly, in the long run the resulting
intervals will actually contain the observed y values 100(1 − α)% of the time. Notice
that the 1 underneath the initial square root symbol makes the PI (12.7) wider than
the CI (12.6), though the intervals are both centered at $\hat\beta_0 + \hat\beta_1 x^*$. Also, as n → ∞,
the width of the CI approaches 0, whereas the width of the PI does not (because even
with perfect knowledge of $\beta_0$ and $\beta_1$, there will still be uncertainty in prediction).
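
The limiting behavior just described can be written out explicitly. The display below is a sketch under assumptions not stated in the text: that $S_{xx} \to \infty$ while $\bar{x}$ stays bounded as $n \to \infty$ (the x values do not pile up at a single point), that s converges to σ, and that $z_{\alpha/2}$ denotes the standard normal critical value to which $t_{\alpha/2,\,n-2}$ converges.

```latex
\begin{align*}
\text{CI half-width} &= t_{\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}
  \;\longrightarrow\; z_{\alpha/2}\,\sigma \cdot 0 = 0, \\[4pt]
\text{PI half-width} &= t_{\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}
  \;\longrightarrow\; z_{\alpha/2}\,\sigma .
\end{align*}
```

The limiting PI, $\beta_0 + \beta_1 x^* \pm z_{\alpha/2}\,\sigma$, is what one would use if the true regression line and σ were known: only the variability of the single future observation about the line remains.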


EXAMPLE 12.14  Let’s return to the carbonation depth-strength data of Example 12.13 and calculate a
95% PI for a strength value that would result from selecting a single core specimen
whose depth is 45 mm. Relevant quantities from that example are

$\hat y$ = 13.79    $s_{\hat Y}$ = .7582    s = 2.8640

For a prediction level of 95% based on n − 2 = 16 df, the t critical value is 2.120,
exactly what we previously used for a 95% confidence level. The prediction interval
is then

$$13.79 \pm (2.120)\sqrt{(2.8640)^2 + (.7582)^2} = 13.79 \pm (2.120)(2.963) = 13.79 \pm 6.28 = (7.51, 20.07)$$

Plausible values for a single observation on strength when depth is 45 mm are (at
the 95% prediction level) between 7.51 MPa and 20.07 MPa. The 95% confidence
interval for mean strength when depth is 45 was (12.18, 15.40). The prediction
interval is much wider than this because of the extra $(2.8640)^2$ under the square root.
Figure 12.18, the Minitab output in Example 12.13, shows this interval as well as the
confidence interval.
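
The arithmetic of Example 12.14 can be checked with a short Python sketch. It uses only the quantities quoted in the example; the function name is illustrative, and the critical value 2.120 is taken from the text rather than computed.

```python
import math

Y_HAT = 13.79    # point prediction at depth 45 mm
SE_FIT = 0.7582  # estimated SD of the estimator of the mean
S = 2.8640       # estimate of sigma
T_CRIT = 2.120   # 16 df, 95% level; value copied from the example

def prediction_interval(y_hat, s, se_fit, t_crit):
    """PI (12.7) in the form y_hat +/- t * sqrt(s^2 + se_fit^2)."""
    half_width = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
    return y_hat - half_width, y_hat + half_width

pi_lo, pi_hi = prediction_interval(Y_HAT, S, SE_FIT, T_CRIT)
ci_lo, ci_hi = Y_HAT - T_CRIT * SE_FIT, Y_HAT + T_CRIT * SE_FIT

print("PI:", round(pi_lo, 2), round(pi_hi, 2))
print("CI:", round(ci_lo, 2), round(ci_hi, 2))
print("PI width / CI width:", round((pi_hi - pi_lo) / (ci_hi - ci_lo), 2))
```

The ratio of widths makes the point of the example quantitative: almost all of the PI's width comes from the $s^2$ term, which no amount of additional data would remove.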


The Bonferroni technique can be employed as in the case of confidence
intervals. If a 100(1 − α)% PI is calculated for each of k different values of x, the
simultaneous or joint prediction level for all k intervals is at least 100(1 − kα)%.


EXERCISES Section 12.4 (44—56) 


44. Fitting the simple linear regression model to the n = 27
observations on x = modulus of elasticity and y = flexural
strength given in Exercise 15 of Section 12.2 resulted
in $\hat y$ = 7.592, $s_{\hat Y}$ = .179 when x = 40 and $\hat y$ = 9.741,
$s_{\hat Y}$ = .253 for x = 60.

a. Explain why $s_{\hat Y}$ is larger when x = 60 than when
x = 40.

b. Calculate a confidence interval with a confidence
level of 95% for the true average strength of all beams
whose modulus of elasticity is 40.

c. Calculate a prediction interval with a prediction level
of 95% for the strength of a single beam whose modulus
of elasticity is 40.

d. If a 95% CI is calculated for true average strength
when modulus of elasticity is 60, what will be the
simultaneous confidence level for both this interval
and the interval calculated in part (b)?


45. Reconsider the filtration rate-moisture content data
introduced in Example 12.6 (see also Example 12.7).

a. Compute a 90% CI for $\beta_0 + 125\beta_1$, true average
moisture content when the filtration rate is 125.

b. Predict the value of moisture content for a single
experimental run in which the filtration rate is 125
using a 90% prediction level. How does this interval
compare to the interval of part (a)? Why is this the
case?

c. How would the intervals of parts (a) and (b) compare
to a CI and PI when filtration rate is 115? Answer
without actually calculating these new intervals.

d. Interpret the hypotheses $H_0$: $\beta_0 + 125\beta_1 = 80$ and
$H_a$: $\beta_0 + 125\beta_1 < 80$, and then carry out a test at
significance level .01.


46. Astringency is the quality in a wine that makes the wine
drinker’s mouth feel slightly rough, dry, and puckery. The
paper “Analysis of Tannins in Red Wine Using Multiple
Methods: Correlation with Perceived Astringency”
(Amer. J. of Enol. and Vitic., 2006: 481-485) reported
on an investigation to assess the relationship between
perceived astringency and tannin concentration using
various analytic methods. Here is data provided by the
authors on x = tannin concentration by protein precipitation
and y = perceived astringency as determined by a
panel of tasters.


x | .718  .808  .924   1.000  .667   .529   .514   .559
y | .428  .480  .493   .978   .318   .298   -.224  .198

x | .766  .470  .726   .762   .666   .562   .378   .779
y | .326  -.336 .765   .190   .066   -.221  -.898  .836

x | .674  .858  .406   .927   .311   .319   .518   .687
y | .126  .305  -.577  .779   -.707  -.610  -.648  -.145

x | .907  .638  .234   .781   .326   .433   .319   .238
y | 1.007 -.090 -1.132 .538   -1.098 -.581  -.862  -.551


Relevant summary quantities are as follows:

$\sum x_i$ = 19.404, $\sum y_i$ = −.549, $\sum x_i^2$ = 13.248032,
$\sum y_i^2$ = 11.835795, $\sum x_i y_i$ = 3.497811

$S_{xx}$ = 13.248032 − (19.404)²/32 = 1.48193150,
$S_{yy}$ = 11.82637622

$S_{xy}$ = 3.497811 − (19.404)(−.549)/32
= 3.83071088


a. Fit the simple linear regression model to this data.
Then determine the proportion of observed variation in
astringency that can be attributed to the model relationship
between astringency and tannin concentration.

b. Calculate and interpret a confidence interval for the
slope of the true regression line.

c. Estimate true average astringency when tannin concentration
is .6, and do so in a way that conveys
information about reliability and precision.

d. Predict astringency for a single wine sample whose
tannin concentration is .6, and do so in a way that
conveys information about reliability and precision.

e. Does it appear that true average astringency for a
tannin concentration of .7 is something other than 0?
State and test the appropriate hypotheses.


47. The simple linear regression model provides a very good
fit to the data on rainfall and runoff volume given in
Exercise 16 of Section 12.2. The equation of the least
squares line is y = −1.128 + .82697x, r² = .975, and
s = 5.24.

a. Use the fact that $s_{\hat Y}$ = 1.44 when rainfall volume is
40 m³ to predict runoff in a way that conveys information
about reliability and precision. Does the
resulting interval suggest that precise information
about the value of runoff for this future observation
is available? Explain your reasoning.

b. Calculate a PI for runoff when rainfall is 50 using the
same prediction level as in part (a). What can be said
about the simultaneous prediction level for the two
intervals you have calculated?


48. The catch basin in a storm-sewer system is the interface
between surface runoff and the sewer. The catch-basin insert
is a device for retrofitting catch basins to improve pollutant-removal
properties. The article “An Evaluation of the
Urban Stormwater Pollutant Removal Efficiency of
Catch Basin Inserts” (Water Envir. Res., 2005: 500-510)
reported on tests of various inserts under controlled conditions
for which inflow is close to what can be expected in the
field. Consider the following data, read from a graph in the
article, for one particular type of insert on x = amount filtered
(1000s of liters) and y = % total suspended solids removed.


x | 23   45   68   91   114  136 159  182  205 228

y | 53.3 26.9 54.8 33.8 29.9 8.2 17.2 12.2 3.2 11.1


Summary quantities are

$\sum x_i$ = 1251, $\sum x_i^2$ = 199,365, $\sum y_i$ = 250.6,
$\sum y_i^2$ = 9249.36, $\sum x_i y_i$ = 21,904.4

a. Does a scatterplot support the choice of the simple
linear regression model? Explain.

b. Obtain the equation of the least squares line.

c. What proportion of observed variation in % removed
can be attributed to the model relationship?

d. Does the simple linear regression model specify a
useful relationship? Carry out an appropriate test of
hypotheses using a significance level of .05.

e. Is there strong evidence for concluding that there is at
least a 2% decrease in true average suspended solid
removal associated with a 10,000 liter increase in the
amount filtered? Test appropriate hypotheses using
α = .05.

f. Calculate and interpret a 95% CI for true average %
removed when amount filtered is 100,000 liters. How
does this interval compare in width to a CI when
amount filtered is 200,000 liters?

g. Calculate and interpret a 95% PI for % removed
when amount filtered is 100,000 liters. How does
this interval compare in width to the CI calculated in
(f) and to a PI when amount filtered is 200,000 liters?


49. You are told that a 95% CI for expected lead content
when traffic flow is 15, based on a sample of n = 10
observations, is (462.1, 597.7). Calculate a CI with confidence
level 99% for expected lead content when traffic
flow is 15.


Silicon-germanium alloys have been used in certain types of 
solar cells. The paper ‘Silicon-Germanium Films 
Deposited by Low-Frequency Plasma-Enhanced 
Chemical Vapor Deposition” (J. of Material Res., 2006: 
88-104) reported on a study of various structural and electri- 
cal properties. Consider the accompanying data on x = Ge 
concentration in solid phase (ranging from 0 to 1) and y = 
Fermi level position (eV): 


x | 0 .42 .23 .33 .62 .60 .45 .87 .90 .79 1 1 1

y | .62 .53 .61 .59 .50 .55 .59 .31 .43 .46 .23 .22 .19


A scatterplot shows a substantial linear relationship. 
Here is Minitab output from a least squares fit. [Note: 
There are several inconsistencies between the data given 
in the paper, the plot that appears there, and the summary 
information about a regression analysis. ] 


The regression equation is 
Fermi pos = 0.7217 - 0.4327 Ge conc


S = 0.0737573 


R-Sq = 80.2% R-Sq(adj) = 78.4% 


Analysis of Variance 


Source DF SS MS F P 
Regression 1 0.241728 0.241728 44.43 0.000 
Error 11 0.059842 0.005440 

Total 12 0.301569 


a. Obtain an interval estimate of the expected change in
Fermi-level position associated with an increase of .1
in Ge concentration, and interpret your estimate.

b. Obtain an interval estimate for mean Fermi-level
position when concentration is .50, and interpret
your estimate.

c. Obtain an interval of plausible values for position
resulting from a single observation to be made when
concentration is .50, interpret your interval, and
compare to the interval of (b).

d. Obtain simultaneous CIs for expected position when
concentration is .3, .5, and .7; the joint confidence
level should be at least 97%.


51. Refer to Example 12.12 in which x = test track speed
and y = rolling test speed.

a. Minitab gave $s_{\hat\beta_0 + \hat\beta_1(45)} = .120$ and $s_{\hat\beta_0 + \hat\beta_1(47)} = .186$.
Why is the former estimated standard deviation
smaller than the latter one?

b. Use the Minitab output from the example to calcu- 
late a 95% CI for expected rolling speed when test 
speed = 45. 

c. Use the Minitab output to calculate a 95% PI for a 
single value of rolling speed when test speed = 47. 


Plasma etching is essential to the fine-line pattern transfer 
in semiconductor processes. The article “Ion Beam- 
Assisted Etching of Aluminum with Chlorine” (J. of 
the Electrochem. Soc., 1985: 2010-2012) gives the 
accompanying data (read from a graph) on chlorine flow 
(x, in SCCM) through a nozzle used in the etching mech- 
anism and etch rate (y, in 100 A/min). 


x | 1.5 1.5 2.0 2.5 2.5 3.0 3.5 3.5 4.0


y | 23.0 24.5 25.0 30.0 33.5 40.0 40.5 47.0 49.0 


53. 


54. 


The summary statistics are $\sum x_i = 24.0$, $\sum y_i = 312.5$,
$\sum x_i^2 = 70.50$, $\sum x_i y_i = 902.25$, $\sum y_i^2 = 11{,}626.75$,
$\hat\beta_0 = 6.448718$, $\hat\beta_1 = 10.602564$.


a. Does the simple linear regression model specify a 
useful relationship between chlorine flow and etch 
rate? 

b. Estimate the true average change in etch rate associ- 
ated with a 1-SCCM increase in flow rate using a 
95% confidence interval, and interpret the interval. 


c. Calculate a 95% CI for $\mu_{Y \cdot 3.0}$, the true average etch
rate when flow = 3.0. Has this average been precisely
estimated?

d. Calculate a 95% PI for a single future observation on 
etch rate to be made when flow = 3.0. Is the predic- 
tion likely to be accurate? 

e. Would the 95% CI and PI when flow = 2.5 be wider
or narrower than the corresponding intervals of parts 
(c) and (d)? Answer without actually computing the 
intervals. 

f. Would you recommend calculating a 95% PI for a 
flow of 6.0? Explain. 


Consider the following four intervals based on the data 
of Exercise 12.17 (Section 12.2): 

a. A 95% CI for mean porosity when unit weight is 110 
b. A 95% PI for porosity when unit weight is 110 

c. A 95% CI for mean porosity when unit weight is 115 
d. A 95% PI for porosity when unit weight is 115 


Without computing any of these intervals, what can be 
said about their widths relative to one another? 


The height of a patient is useful for a variety of medical 
purposes, such as estimating tidal volume of someone in 
an intensive care who requires artificial ventilation. 
However, it can be difficult to make an accurate 


measurement if the patient is confused, unconscious, or 
sedated. And measurement of height while an individual 
is lying down is also not straightforward. In contrast, 
ulna length measurements are generally quick and easy 
to obtain, even in chair- or bed-bound patients. The 
accompanying data on x = ulna length (cm) and y = 
height (cm) for males older than 65 was read from a 
graph in the article “Ulna Length to Predict Height in 
English and Portuguese Patient Populations” 
(European J. of Clinical Nutr., 2012: 209-215). 


x | 22.5 22.8 22.8 23.3 23.3 24.4 25.0
y | 158 155 156 160 161 162 164


x | 25.0 25.0 25.0 26.0 26.0 26.8 28.2
y | 166 167 170 166 173 178 174


Summary quantities include $\sum x_i = 346.1$, $\sum y_i = 2310$,
$S_{xx} = 36.463571$, $S_{xy} = 137.60$, $S_{yy} = 626.00$.

a. Obtain the equation of the estimated regression line 
and interpret its slope. 

b. Calculate and interpret the coefficient of determination.

c. Carry out a test of model utility.

d. Calculate prediction intervals for the heights of two 
individuals whose ulna lengths are 23 and 25, 
respectively; use a prediction level of 95% for each 
interval. 

e. Based on the predictions of (d), would you agree
with the statement in the cited article that “height can
be predicted from ulna length with precision”?


55. Verify that $V(\hat\beta_0 + \hat\beta_1 x)$ is indeed given by the expression
in the text. [Hint: $V(\sum d_i Y_i) = \sum d_i^2 \cdot V(Y_i)$.]

56. The article “Bone Density and Insertion Torque as
Predictors of Anterior Cruciate Ligament Graft 
Fixation Strength” (The Amer. J. of Sports Med., 2004: 
1421-1429) gave the accompanying data on maximum 
insertion torque (N - m) and yield load (N), the latter being 
one measure of graft strength, for 15 different specimens. 


Torque | 1.8 2.2 1.9 1.3 2.1 2.2 1.6 2.1
Load | 491 477 598 361 605 671 466 431


Torque | 1.2 1.8 2.6 2.5 2.5 1.7 1.6
Load | 384 422 554 577 642 348 446


a. Is it plausible that yield load is normally distributed? 

b. Estimate true average yield load by calculating a 
confidence interval with a confidence level of 95%, 
and interpret the interval. 

c. Here is output from Minitab for the regression of
yield load on torque. Does the simple linear regression
model specify a useful relationship between the
variables?

Predictor   Coef     SE Coef   T      P
Constant    152.44   91.17     1.67   0.118
Torque      178.23   45.97     3.88   0.002

S = 73.2141   R-Sq = 53.6%   R-Sq(adj) = 50.0%

Source           DF   SS       MS      F       P
Regression       1    80554    80554   15.03   0.002
Residual Error   13   69684    5360
Total            14   150238


d. The authors of the cited paper state, “Consequently, we 
cannot but conclude that simple regression analysis- 
based methods are not clinically sufficient to predict 
individual fixation strength’? Do you agree? [Hint: 
Consider predicting yield load when torque is 2.0.] 


12.5 Correlation 


There are many situations in which the objective in studying the joint behavior of 
two variables is to see whether they are related, rather than to use one to predict the 
value of the other. In this section, we first develop the sample correlation coefficient 
r as a measure of how strongly related two variables x and y are in a sample and then
relate r to the correlation coefficient $\rho$ defined in Chapter 5.


The Sample Correlation Coefficient r 


Given n numerical pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, it is natural to speak of x and y as
having a positive relationship if large x’s are paired with large y’s and small x’s with 
small y’s. Similarly, if large x’s are paired with small y’s and small x’s with large y’s, 
then a negative relationship between the variables is implied. Consider the quantity 


$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$

Then if the relationship is strongly positive, an $x_i$ above the mean $\bar{x}$ will tend to be
paired with a $y_i$ above the mean $\bar{y}$, so that $(x_i - \bar{x})(y_i - \bar{y}) > 0$, and this product will
also be positive whenever both $x_i$ and $y_i$ are below their respective means. Thus a
positive relationship implies that $S_{xy}$ will be positive. An analogous argument shows
that when the relationship is negative, $S_{xy}$ will be negative, since most of the products
$(x_i - \bar{x})(y_i - \bar{y})$ will be negative. This is illustrated in Figure 12.19.


Figure 12.19 (a) Scatterplot with $S_{xy}$ positive; (b) scatterplot with $S_{xy}$ negative
[+ means $(x_i - \bar{x})(y_i - \bar{y}) > 0$, and $-$ means $(x_i - \bar{x})(y_i - \bar{y}) < 0$]


Although $S_{xy}$ seems a plausible measure of the strength of a relationship, we
do not yet have any idea of how positive or negative it can be. Unfortunately, $S_{xy}$ has
a serious defect: By changing the unit of measurement for either x or y, $S_{xy}$ can be
made either arbitrarily large in magnitude or arbitrarily close to zero. For example,
if $S_{xy} = 25$ when x is measured in meters, then $S_{xy} = 25{,}000$ when x is measured in
millimeters and .025 when x is expressed in kilometers. A reasonable condition to
impose on any measure of how strongly x and y are related is that the calculated
measure should not depend on the particular units used to measure them. This condition
is achieved by modifying $S_{xy}$ to obtain the sample correlation coefficient.


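DEFINITION The sample correlation coefficient for the n pairs $(x_1, y_1), \ldots, (x_n, y_n)$ is

$$r = \frac{S_{xy}}{\sqrt{\sum (x_i - \bar{x})^2}\,\sqrt{\sum (y_i - \bar{y})^2}} = \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}} \qquad (12.8)$$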


EXAMPLE 12.15 An accurate assessment of soil productivity is critical to rational land-use planning. 
Unfortunately, as the author of the article “Productivity Ratings Based on Soil 
Series” (Prof. Geographer, 1980: 158-163) argues, an acceptable soil productivity 
index is not so easy to come by. One difficulty is that productivity is determined partly 
by which crop is planted, and the relationship between the yield of two different crops 
planted in the same soil may not be very strong. To illustrate, the article presents the 
accompanying data on corn yield x and peanut yield y (mT/Ha) for eight different types 
of soil. 


x | 2.4 3.4 4.6 3.7 2.2 3.3 4.0 2.1


y | 1.33 2.12 1.80 1.65 2.00 1.76 2.11 1.63 
With $\sum x_i = 25.7$, $\sum y_i = 14.40$, $\sum x_i^2 = 88.31$, $\sum x_i y_i = 46.856$, and $\sum y_i^2 = 26.4324$,

$$S_{xx} = 88.31 - \frac{(25.7)^2}{8} = 5.75 \qquad S_{yy} = 26.4324 - \frac{(14.40)^2}{8} = .5124$$

$$S_{xy} = 46.856 - \frac{(25.7)(14.40)}{8} = .5960$$

from which

$$r = \frac{.5960}{\sqrt{5.75}\,\sqrt{.5124}} = .347$$
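The arithmetic of Example 12.15 can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function name `sample_correlation` is a hypothetical helper, not something from the text, and the fourth corn yield is taken to be 3.7, the value implied by the stated sum of 25.7.

```python
import math

def sample_correlation(x, y):
    """Sample correlation r = S_xy / sqrt(S_xx * S_yy), via the computing formulas."""
    n = len(x)
    sxx = sum(v * v for v in x) - sum(x) ** 2 / n
    syy = sum(v * v for v in y) - sum(y) ** 2 / n
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return sxy / math.sqrt(sxx * syy)

# Corn yield x and peanut yield y (mT/Ha) for the eight soil types
corn = [2.4, 3.4, 4.6, 3.7, 2.2, 3.3, 4.0, 2.1]
peanut = [1.33, 2.12, 1.80, 1.65, 2.00, 1.76, 2.11, 1.63]

r = sample_correlation(corn, peanut)
print(round(r, 3))  # 0.347
```

Swapping the two arguments returns the same value, which is Property 1 of r in the list that follows.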


Properties of r 
The most important properties of r are as follows: 


1. The value of r does not depend on which of the two variables under study is 
labeled x and which is labeled y. 


2. The value of r is independent of the units in which x and y are measured. 
3. $-1 \le r \le 1$


4. r = 1 if and only if (iff) all $(x_i, y_i)$ pairs lie on a straight line with positive slope,
and r = $-1$ iff all $(x_i, y_i)$ pairs lie on a straight line with negative slope.

5. The square of the sample correlation coefficient gives the value of the coefficient
of determination that would result from fitting the simple linear regression
model; in symbols, $(r)^2 = r^2$.


Property 1 stands in marked contrast to what happens in regression analysis, 
where virtually all quantities of interest (the estimated slope, estimated y-intercept, 
$s^2$, etc.) depend on which of the two variables is treated as the dependent variable.
However, Property 5 shows that the proportion of variation in the dependent variable 
explained by fitting the simple linear regression model does not depend on which 
variable plays this role. 

Property 2 is equivalent to saying that r is unchanged if each $x_i$ is replaced by
$cx_i$ and if each $y_i$ is replaced by $dy_i$ with c and d positive (a change in the scale of measurement), as well
as if each $x_i$ is replaced by $x_i - a$ and $y_i$ by $y_i - b$ (which changes the location of
zero on the measurement axis). This implies, for example, that r is the same whether 
temperature is measured in °F or °C. 

Property 3 tells us that the maximum value of r, corresponding to the largest pos- 
sible degree of positive relationship, is r = 1, whereas the most negative relationship 
is identified with r = —1. According to Property 4, the largest positive and largest 
negative correlations are achieved only when all points lie along a straight line. Any 
other configuration of points, even if the configuration suggests a deterministic relationship
between variables, will yield an r value less than 1 in absolute magnitude.
Thus r measures the degree of linear relationship among variables. A value of r near 
0 is not evidence of the lack of a strong relationship, but only the absence of a linear 
relation, so that such a value of r must be interpreted with caution. Figure 12.20 
illustrates several configurations of points associated with different values of r. 
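Two of these points are easy to check numerically. The sketch below uses made-up illustrative numbers (none of them come from the text) and a hypothetical helper `sample_correlation`. It shows that a perfectly deterministic but symmetric quadratic relationship gives r = 0, and that r is unchanged when temperature is converted from °C to °F.

```python
import math

def sample_correlation(x, y):
    """Sample correlation computed from deviations about the means."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Deterministic but nonlinear: y = x^2 on a grid symmetric about 0 (hypothetical data)
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]
r_quad = sample_correlation(x, y)   # no linear association at all

# Property 2: changing units leaves r alone (hypothetical data)
temps_c = [10.0, 15.0, 20.0, 25.0, 30.0]
response = [1.2, 1.9, 3.1, 3.8, 5.2]
temps_f = [1.8 * c + 32 for c in temps_c]
r_c = sample_correlation(temps_c, response)
r_f = sample_correlation(temps_f, response)

print(r_quad, abs(r_c - r_f) < 1e-9)
```

The quadratic case is the situation of panel (d) in Figure 12.20: a strong relationship that r does not detect because it is not linear.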

A frequently asked question is, “When can it be said that there is a strong 
correlation between the variables, and when is the correlation weak?” Here is an 
informal rule of thumb for characterizing the value of r: 


Weak: $-.5 \le r \le .5$
Moderate: either $-.8 < r < -.5$ or $.5 < r < .8$
Strong: either $r \ge .8$ or $r \le -.8$


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 




Figure 12.20 Data plots for different values of r: (a) r near +1; (b) r near $-1$; (c) r near 0, no apparent relationship; (d) r near 0, nonlinear relationship


It may surprise you that an r as substantial as .5 or —.5 goes in the weak category. 
The rationale is that if r = .5 or $-.5$, then $r^2 = .25$ in a regression with either variable
playing the role of y. A regression model that explains at most 25% of observed
variation is not in fact very impressive. In Example 12.15, the correlation between 
corn yield and peanut yield would be described as weak. 


Inferences About the Population 
Correlation Coefficient 


The correlation coefficient r is a measure of how strongly related x and y are in the
observed sample. We can think of the pairs $(x_i, y_i)$ as having been drawn from a
bivariate population of pairs, with $(X_i, Y_i)$ having some joint pmf or pdf. In Chapter
5, we defined the correlation coefficient $\rho(X, Y)$ by

$$\rho = \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}$$

where

$$\mathrm{Cov}(X, Y) = \begin{cases} \displaystyle\sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, p(x, y) & (X, Y) \text{ discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dx\, dy & (X, Y) \text{ continuous} \end{cases}$$


If we think of p(x, y) or f(x, y) as describing the distribution of pairs of values
within the entire population, $\rho$ becomes a measure of how strongly related x and
y are in that population. Properties of $\rho$ analogous to those for r were given in
Chapter 5.

The population correlation coefficient $\rho$ is a parameter or population characteristic,
just as $\mu_X$, $\mu_Y$, $\sigma_X$, and $\sigma_Y$ are, so we can use the sample correlation coefficient
to make various inferences about $\rho$. In particular, r is a point estimate for $\rho$,
and the corresponding estimator is

$$\hat\rho = R = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}}$$
EXAMPLE 12.16 Medical researchers have noted that adolescent females are much more likely to deliver
low-birth-weight babies than are adult females. Because such babies have higher mortality
rates, numerous investigations have focused on the relationship between mother’s
age and birth weight. One such study is described in the article “Body Size and
Intelligence in 6-Year-Olds: Are Offspring of Teenage Mothers at Risk?” (Maternal
and Child Health J., 2009: 847-856). The following data on x = maternal age (yr) and
y = baby’s birth weight (g) is consistent with summary quantities given in the cited
article as well as with data published by the National Center for Health Statistics.


x | 15 17 18 15 16 19 17 16 18 19 
y | 2289 3393 3271 2648 2897 3327 2970 2535 3138 3573 


A scatterplot of the data shows a rather substantial increasing linear pattern.
Relevant summary quantities are $\sum x_i = 170$, $\sum y_i = 30{,}041$, $\sum x_i^2 = 2910$, $\sum y_i^2 = 91{,}785{,}351$,
$\sum x_i y_i = 515{,}600$, from which $S_{xx} = 20$, $S_{yy} = 1{,}539{,}182.90$, $S_{xy} = 4903$.
Then

$$r = \frac{4903}{\sqrt{20}\,\sqrt{1{,}539{,}182.90}} = .884$$

With $\rho$ denoting the correlation between mother’s age and baby’s weight in the
entire population of adolescent mothers who gave birth, the point estimate of $\rho$ is
$\hat\rho = r = .884$.

The small-sample intervals and test procedures presented in Chapters 7-9
were based on an assumption of population normality. To test hypotheses about $\rho$, an
analogous assumption about the distribution of pairs of (x, y) values in the population
is required. We are now assuming that both X and Y are random (much of our
regression work focused thus far on x fixed by the experimenter) with a bivariate
normal probability distribution as described in Section 5.2. Recall that in this case,
$\rho = 0$ implies that X and Y are independent rv’s.

Assuming that the pairs are drawn from a bivariate normal distribution allows us
to test hypotheses about $\rho$ and to construct a CI. There is no completely satisfactory way
to check the plausibility of the bivariate normality assumption. A partial check involves
constructing two separate normal probability plots, one for the sample $x_i$’s and another
for the sample $y_i$’s, since bivariate normality implies that the marginal distributions
of both X and Y are normal. If either plot deviates substantially from a straight-line
pattern, the following inferential procedures should not be used for small n.


Testing for the Absence of Correlation 


Let R denote the sample correlation coefficient as a random variable (before
data is obtained). When $H_0\colon \rho = 0$ is true, the test statistic

$$T = \frac{R\sqrt{n - 2}}{\sqrt{1 - R^2}}$$

has a t distribution with n - 2 df.


Alternative Hypothesis      P-Value Determination

$H_a\colon \rho > 0$        Area under the $t_{n-2}$ curve to the right of t
$H_a\colon \rho < 0$        Area under the $t_{n-2}$ curve to the left of t
$H_a\colon \rho \ne 0$      2 · (Area under the $t_{n-2}$ curve to the right of $|t|$)
EXAMPLE 12.17 Neurotoxic effects of manganese are well known and are usually caused by high
occupational exposure over long periods of time. In the fields of occupational
hygiene and environmental hygiene, the relationship between lipid peroxidation
(which is responsible for deterioration of foods and damage to live tissue) and occupational
exposure has not been previously reported. The article “Lipid Peroxidation
in Workers Exposed to Manganese” (Scand. J. of Work and Environ. Health,
1996: 381-386) gives data on x = manganese concentration in blood (ppb) and y =
concentration (mmol/L) of malondialdehyde, which is a stable product of lipid peroxidation,
both for a sample of 22 workers exposed to manganese and for a control
sample of 45 individuals. The value of r for the control sample is .29, from which

$$t = \frac{(.29)\sqrt{45 - 2}}{\sqrt{1 - (.29)^2}} \approx 2.0$$

The corresponding P-value for a two-tailed t test based on 43 df is roughly .052
(the cited article reported only that P-value > .05). We would not want to reject the
assertion that $\rho = 0$ at either significance level .01 or .05. For the sample of exposed
workers, r = .83 and $t \approx 6.7$, clear evidence that there is a linear association in the
entire population of exposed workers from which the sample was selected.
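The two t values in Example 12.17 can be checked directly from r and n. This is a minimal standard-library sketch; `correlation_t_statistic` is a hypothetical helper name. It computes only the test statistic, since the standard library has no t distribution; a P-value would need t tables or a statistics package (for example, SciPy's `scipy.stats.t.sf`, assuming SciPy is available).

```python
import math

def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), based on n - 2 df under H0: rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

t_control = correlation_t_statistic(0.29, 45)   # control sample, 43 df
t_exposed = correlation_t_statistic(0.83, 22)   # exposed workers, 20 df

print(round(t_control, 2), round(t_exposed, 2))  # 1.99 6.65
```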


Because $\rho$ measures the extent to which there is a linear relationship between
the two variables in the population, the null hypothesis $H_0\colon \rho = 0$ states that there
is no such population relationship. In Section 12.3, we used the t ratio $\hat\beta_1 / s_{\hat\beta_1}$ to
test for a linear relationship between the two variables in the context of regression
analysis. It turns out that the two test procedures are completely equivalent because
$r\sqrt{n - 2}/\sqrt{1 - r^2} = \hat\beta_1 / s_{\hat\beta_1}$. When interest lies only in assessing the strength of
any linear relationship rather than in fitting a model and using it to estimate or predict,
the test statistic formula just presented requires fewer computations than does
the t ratio.
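The equivalence of the two statistics can be verified in a few lines. The sketch below assumes the Section 12.3 expressions $\hat\beta_1 = S_{xy}/S_{xx}$, $s_{\hat\beta_1} = s/\sqrt{S_{xx}}$, and $s^2 = \mathrm{SSE}/(n - 2)$, together with $\mathrm{SSE} = S_{yy}(1 - r^2)$, which follows from Property 5 of r because the coefficient of determination is $1 - \mathrm{SSE}/S_{yy}$.

```latex
\begin{align*}
\frac{\hat\beta_1}{s_{\hat\beta_1}}
  &= \frac{S_{xy}/S_{xx}}{s/\sqrt{S_{xx}}}
   = \frac{S_{xy}}{\sqrt{S_{xx}}}\cdot\frac{1}{s} \\
  &= \frac{S_{xy}}{\sqrt{S_{xx}}}\cdot\sqrt{\frac{n-2}{S_{yy}(1-r^2)}} \\
  &= \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}}\cdot\frac{\sqrt{n-2}}{\sqrt{1-r^2}}
   = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}
\end{align*}
```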


Other Inferences Concerning p 


The procedure for testing $H_0\colon \rho = \rho_0$ when $\rho_0 \ne 0$ is not equivalent to any procedure
from regression analysis. The test statistic as well as a confidence interval formula
are based on a transformation of R developed by the famous statistician R. A. Fisher.


PROPOSITION When $(X_1, Y_1), \ldots, (X_n, Y_n)$ is a sample from a bivariate normal distribution,
the rv

$$V = \frac{1}{2}\ln\left(\frac{1 + R}{1 - R}\right) \qquad (12.9)$$

has approximately a normal distribution with mean and variance

$$\mu_V = \frac{1}{2}\ln\left(\frac{1 + \rho}{1 - \rho}\right) \qquad \sigma_V^2 = \frac{1}{n - 3}$$


The rationale for the transformation is to obtain a function of R that has a variance
independent of $\rho$; this would not be the case with R itself. Also, the transformation
should not be used if n is quite small, since the approximation will not be valid.
The test statistic for testing $H_0\colon \rho = \rho_0$ is

$$Z = \frac{V - \frac{1}{2}\ln[(1 + \rho_0)/(1 - \rho_0)]}{1/\sqrt{n - 3}}$$


Alternative Hypothesis         P-Value Determination

$H_a\colon \rho > \rho_0$      Area under the standard normal curve to the right of z
$H_a\colon \rho < \rho_0$      Area under the standard normal curve to the left of z
$H_a\colon \rho \ne \rho_0$    2 · (Area under the standard normal curve to the right of $|z|$)


EXAMPLE 12.18 The article “Size Effect in Shear Strength of Large Beams—Behavior and Finite
Element Modelling” (Mag. of Concrete Res., 2005: 497-509) reported on a study 
of various characteristics of large reinforced concrete deep and shallow beams tested 
until failure. Consider the following data on x = cube strength and y = cylinder 
strength (both in MPa): 


x | 55.10 44.83 46.32 51.10 49.89 45.20 48.18 46.70 54.31 41.50 
y | 49.10 31.20 32.80 42.60 42.50 32.70 36.21 40.40 37.42 30.80 
x | 47.50 52.00 52.25 50.86 51.66 54.77 57.06 57.84 55.22 
y | 35.34 44.80 41.75 39.35 44.07 43.40 45.30 39.08 41.89 


Then $S_{xx} = 367.74$, $S_{yy} = 488.54$, and $S_{xy} = 322.37$, from which r = .761. Does
this provide strong evidence for concluding that the two measures of strength are at
least moderately positively correlated?


Our previous interpretation of moderate positive correlation was $.5 < \rho < .8$, so we
wish to test $H_0\colon \rho = .5$ versus $H_a\colon \rho > .5$. The computed value of V is then

$$v = .5\ln\left(\frac{1 + .761}{1 - .761}\right) = .999 \qquad \text{and} \qquad .5\ln\left(\frac{1 + .5}{1 - .5}\right) = .549$$

Thus $z = (.999 - .549)\sqrt{19 - 3} = 1.80$. The P-value for this upper-tailed test is
$1 - \Phi(1.80) = .0359$. The null hypothesis can therefore be rejected at significance
level .05 but not at level .01. This latter result is somewhat surprising in light of the
magnitude of r, but when n is small, a reasonably large r may result even when $\rho$ is
not all that substantial. At significance level .01, the evidence for a moderately positive
correlation is not compelling.
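The Fisher-transformation test of Example 12.18 can be sketched with the Python standard library, whose `statistics.NormalDist` supplies the standard normal cdf. The helper names `fisher_v` and `fisher_z_test_upper` are hypothetical. Because the script carries full precision rather than rounding v to three decimals, its z and P-value differ slightly from the hand computation (about 1.797 and .036 versus 1.80 and .0359).

```python
import math
from statistics import NormalDist

def fisher_v(r):
    """Fisher transformation v = (1/2) ln[(1 + r) / (1 - r)]."""
    return 0.5 * math.log((1 + r) / (1 - r))

def fisher_z_test_upper(r, rho0, n):
    """z statistic and upper-tailed P-value for H0: rho = rho0 vs Ha: rho > rho0."""
    z = (fisher_v(r) - fisher_v(rho0)) * math.sqrt(n - 3)
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Cube strength vs cylinder strength: r = .761 from n = 19 beams, rho0 = .5
z, p_value = fisher_z_test_upper(0.761, 0.5, 19)
print(round(z, 2), round(p_value, 3))  # 1.8 0.036
```

For a lower-tailed or two-tailed alternative, only the last step changes: use `NormalDist().cdf(z)` or twice the upper-tail area beyond `abs(z)`, as in the P-value table above.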


To obtain a CI for $\rho$, we first derive an interval for $\mu_V = \frac{1}{2}\ln[(1 + \rho)/(1 - \rho)]$.
Standardizing V, writing a probability statement, and manipulating the resulting
inequalities yields

$$\left( v - \frac{z_{\alpha/2}}{\sqrt{n - 3}},\; v + \frac{z_{\alpha/2}}{\sqrt{n - 3}} \right) \qquad (12.10)$$

as a $100(1 - \alpha)\%$ interval for $\mu_V$, where $v = \frac{1}{2}\ln[(1 + r)/(1 - r)]$. This interval
can then be manipulated to yield the desired CI.


A $100(1 - \alpha)\%$ confidence interval for $\rho$ is

$$\left( \frac{e^{2c_1} - 1}{e^{2c_1} + 1},\; \frac{e^{2c_2} - 1}{e^{2c_2} + 1} \right)$$

where $c_1$ and $c_2$ are the left and right endpoints, respectively, of the interval
(12.10).


EXAMPLE 12.19 As far back as Leonardo da Vinci, it was known that x = height and y = wingspan
(measured fingertip to fingertip while arms are outstretched side to side) are closely
related. Here are measurements from a random sample of students taking a statistics
course:

x | 63.0 63.0 65.0 64.0 68.0 69.0 71.0 68.0
y | 62.0 62.0 64.0 64.5 67.0 69.0 70.0 72.0
x | 68.0 72.0 73.0 73.5 70.0 70.0 72.0 74.0
y | 70.0 72.0 73.0 75.0 71.0 70.0 76.0 76.5


A scatterplot shows an approximate linear pattern, and so do normal probability plots
of x and y. The sample correlation coefficient is computed to be r = .9422. Its Fisher
transformation is

$$v = \frac{1}{2}\ln\left(\frac{1 + .9422}{1 - .9422}\right) = 1.757$$

A 95% CI for $\mu_V$ is

$$1.757 \pm \frac{1.96}{\sqrt{16 - 3}} = (1.213, 2.301)$$

The CI for $\rho$ with a confidence level of approximately 95% is therefore

$$\left( \frac{e^{2(1.213)} - 1}{e^{2(1.213)} + 1},\; \frac{e^{2(2.301)} - 1}{e^{2(2.301)} + 1} \right) = (.838, .980)$$

Notice that the interval includes only values exceeding .8, so it appears that there is
a strong linear association between the two variables in the sampled population.
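The interval of Example 12.19 can be reproduced with a short standard-library sketch. The function name `correlation_ci` is a hypothetical helper; `statistics.NormalDist().inv_cdf` supplies the critical value $z_{\alpha/2}$ in place of a normal table.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, level=0.95):
    """Approximate CI for rho via the Fisher transformation (bivariate normality assumed)."""
    v = 0.5 * math.log((1 + r) / (1 - r))
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for 95%
    half_width = z_crit / math.sqrt(n - 3)
    c1, c2 = v - half_width, v + half_width               # interval (12.10) for mu_V

    def back_transform(c):
        # Inverse of the Fisher transformation; algebraically equal to tanh(c)
        return (math.exp(2 * c) - 1) / (math.exp(2 * c) + 1)

    return back_transform(c1), back_transform(c2)

# Height vs wingspan: r = .9422 from n = 16 students
lower, upper = correlation_ci(0.9422, 16)
print(round(lower, 3), round(upper, 3))  # 0.838 0.98
```

Because the back-transformation is increasing in c, the endpoints of (12.10) map directly to the endpoints of the interval for $\rho$.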


In Chapter 5, we cautioned that a large value of the correlation coefficient (near 
1 or $-1$) implies only association and not causation. This applies to both $\rho$ and r.


EXERCISES Section 12.5 (57-67) 


57. The article “Behavioural Effects of Mobile Telephone
Use During Simulated Driving” (Ergonomics, 1995:
2536-2562) reported that for a sample of 20 experimental
subjects, the sample correlation coefficient for x =
age and y = time since the subject had acquired a driving
license (yr) was .97. Why do you think the value of r is
so close to 1? (The article’s authors give an explanation.)

58. The Turbine Oil Oxidation Test (TOST) and the Rotating
Bomb Oxidation Test (RBOT) are two different procedures
for evaluating the oxidation stability of steam turbine
oils. The article “Dependence of Oxidation Stability
of Steam Turbine Oil on Base Oil Composition” (J. of
the Society of Tribologists and Lubrication Engrs., Oct.
1997: 19-24) reported the accompanying observations on


59, 


60. 


x = TOST time (hr) and y = RBOT time (min) for 12 oil 
specimens. 


TOST | 4200 3600 3750 3675 4050 2770
RBOT | 370 340 375 310 350 200
TOST | 4870 4500 3450 2700 3750 3300
RBOT | 400 375 285 225 345 285


a. Calculate and interpret the value of the sample cor- 
relation coefficient (as do the article’s authors). 

b. How would the value of r be affected if we had let 
x = RBOT time and y = TOST time? 

c. How would the value of r be affected if RBOT time 
were expressed in hours? 
d. Construct normal probability plots and comment.

e. Carry out a test of hypotheses to decide whether 
RBOT time and TOST time are linearly related. 


Toughness and fibrousness of asparagus are major determi- 
nants of quality. This was the focus of a study reported in 
“Post-Harvest Glyphosphate Application Reduces 
Toughening, Fiber Content, and Lignification of Stored 
Asparagus Spears” (J. of the Amer. Soc. of Hort. Science, 
1988: 569-572). The article reported the accompanying 
data (read from a graph) on x = shear force (kg) and y = 
percent fiber dry weight. 


x | 46 48 55 57 60 72 81 85 94
y | 2.18 2.10 2.13 2.28 2.34 2.53 2.28 2.62 2.63

x | 109 121 132 137 148 149 184 185 187
y | 2.50 2.66 2.79 2.80 3.01 2.98 3.34 3.49 3.26


n = 18, $\sum x_i = 1950$, $\sum x_i^2 = 251{,}970$,
$\sum y_i = 47.92$, $\sum y_i^2 = 130.6074$, $\sum x_i y_i = 5530.92$


a. Calculate the value of the sample correlation coeffi- 
cient. Based on this value, how would you describe 
the nature of the relationship between the two vari- 
ables? 

b. If a first specimen has a larger value of shear force 
than does a second specimen, what tends to be true 
of percent dry fiber weight for the two specimens? 

c. If shear force is expressed in pounds, what happens 
to the value of r? Why? 

d. If the simple linear regression model were fit to this 
data, what proportion of observed variation in percent 
fiber dry weight could be explained by the model 
relationship? 

e. Carry out a test at significance level .01 to decide 
whether there is a positive linear association between 
the two variables. 


Head movement evaluations are important because indi- 
viduals, especially those who are disabled, may be able to 
operate communications aids in this manner. The article 
“Constancy of Head Turning Recorded in Healthy 


61. 




Young Humans” (J. of Biomed. Engr., 2008: 428-436) 
reported data on ranges in maximum inclination angles of 
the head in the clockwise anterior, posterior, right, and left 
directions for 14 randomly selected subjects. Consider the 
accompanying data on average anterior maximum inclina- 
tion angle (AMIA) both in the clockwise direction and in 
the counterclockwise direction. 


Subj: 1 2 3 4 5 6 7
Cl: 57.9 35.7 54.5 56.8 51.1 70.8 77.3
Co: 44.2 52.1 60.2 52.7 47.2 65.6 71.4


Subj: 8 9 10 11 12 13 14
Cl: 51.6 54.7 63.6 59.2 59.2 55.8 38.5
Co: 48.8 53.1 66.3 59.8 47.5 64.5 34.5


a. Calculate a point estimate of the population correlation
coefficient between Cl AMIA and Co AMIA
($\sum \mathrm{Cl} = 786.7$, $\sum \mathrm{Co} = 767.9$, $\sum \mathrm{Cl}^2 = 45{,}727.31$,
$\sum \mathrm{Co}^2 = 43{,}478.07$, $\sum \mathrm{Cl}\,\mathrm{Co} = 44{,}187.87$).

b. Assuming bivariate normality (normal probability 
plots of the Cl and Co samples are reasonably 
straight), carry out a test at significance level .01 to 
decide whether there is a linear association between 
the two variables in the population (as do the authors 
of the cited paper). Would the conclusion have been 
the same if a significance level of .001 had been 
used? 


The authors of the paper “Objective Effects of a Six 
Months’ Endurance and Strength Training Program in 
Outpatients with Congestive Heart Failure” (Medicine 
and Science in Sports and Exercise, 1999: 1102-1107) 
presented a correlation analysis to investigate the relation- 
ship between maximal lactate level x and muscular endur- 
ance y. The accompanying data was read from a plot in the 


paper.


x | 400 750 770 800 850 1025 1200
y | 3.80 4.00 4.90 5.20 4.00 3.50 6.30

x | 1250 1300 1400 1475 1480 1505 2200
y | 6.88 7.55 4.95 7.80 4.45 6.60 8.90


$S_{yy} = 36.9839$, $S_{xx} = 2{,}628{,}930.357$, $S_{xy} = 7377.704$. A
scatterplot shows a linear pattern.

a. Test to see whether there is a positive correlation be- 
tween maximal lactate level and muscular endurance 
in the population from which this data was selected. 

b. If a regression analysis were to be carried out to 
predict endurance from lactate level, what propor- 
tion of observed variation in endurance could be 
attributed to the approximate linear relationship? 
Answer the analogous question if regression is used 
to predict lactate level from endurance—and 
answer both questions without doing any regres- 
sion calculations. 
62.


63. 


64. 


The article “Quantitative Estimation of Clay
Mineralogy in Fine-Grained Soils” (J. of Geotechnical 
and Geoenvironmental Engr., 2011: 997-1008) report- 
ed on various chemical properties of natural and artificial 
soils. Here are observations on x = cation exchange 
capacity (CEC, in meq/100 g) and y = specific surface 
area (SSA, in m?/g) of 20 natural soils. 


x | 66 121 134 101 77 89 63 57 117 118 


y | 175 324 460 288 205 210 295 161 314 265 
x | 76 125 75 71 133 104 76 96 58 109 
y | 236 355 240 133 431 306 132 269 158 303 


Minitab gave the following output in response to a 
request for r: 


correlation of x and y = 0.853 


Normal probability plots of x and y are quite straight. 

a. Carry out a test of hypotheses to see if there is a 
positive linear association in the population from 
which the sample data was selected. 

b. With n = 20, how small would the value of r have to 
be in order for the null hypothesis in the test of (a) to 
not be rejected at significance level .01? 

c. Calculate a confidence interval for $\rho$ using a 95%
confidence level.


Physical properties of six flame-retardant fabric samples 
were investigated in the article “Sensory and Physical
Properties of Inherently Flame-Retardant Fabrics” 
(Textile Research, 1984: 61-68). Use the accompanying 
data and a .05 significance level to determine whether a lin- 
ear relationship exists between stiffness x (mg-cm) and 
thickness y (mm). Is the result of the test surprising in light 
of the value of r? 


x | 7.98 24.52 12.47 6.92 24.11 35.71

y | .28 .65 .32 .27 .81 .57


The accompanying data on x = UV transparency index 
and y = maximum prevalence of infection was read from 
a graph in the article “Solar Radiation Decreases 
Parasitism in Daphnia” (Ecology Letters, 2012: 
47-54): 

x {13 14 15 20 22 27 2.7 27 28 
yl 3 32 1 13 0 8 6 2 

x {2.9 30 36 38 38 46 51 57 


yl! 1 7 36 25 10 35 58 56 


Summary quantities include S,, = 
5593.0588, and S,,, = 264.4882. 


25.5224, S, = 


a. Calculate and interpret the value of the sample cor- 
relation coefficient. 

b. If you decided to fit the simple linear regression
model to this data, what proportion of observed
variation in maximum prevalence could be explained
by the model relationship?

c. If you decided to regress UV transparency index on
maximum prevalence (i.e., interchange the roles of x
and y), what proportion of observed variation could
be attributed to the model relationship?

d. Carry out a test of $H_0\colon \rho = .5$ versus $H_a\colon \rho > .5$ using a
significance level of .05. [Note: The cited article reported
the P-value for testing $H_0\colon \rho = 0$ versus $H_a\colon \rho \neq 0$.]


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.


65. Torsion during hip external rotation and extension may
explain why acetabular labral tears occur in professional
athletes. The article “Hip Rotational Velocities During the
Full Golf Swing” (J. of Sports Science and Med., 2009:
296-299) reported on an investigation in which lead hip
internal peak rotational velocity (x) and trailing hip peak
external rotational velocity (y) were determined for a sample
of 15 golfers. Data provided by the article’s authors was
used to calculate the following summary quantities:


$\sum (x_i - \bar{x})^2 = 64{,}732.83$, $\sum (y_i - \bar{y})^2 = 130{,}566.96$,
$\sum (x_i - \bar{x})(y_i - \bar{y}) = 44{,}185.87$


Separate normal probability plots showed very substantial
linear patterns.

a. Calculate a point estimate for the population correlation
coefficient.

b. Carry out a test at significance level .01 to decide
whether there is a linear relationship between the two
velocities in the sampled population.

c. Would the conclusion of (b) have changed if you had
tested appropriate hypotheses to decide whether
there is a positive linear association in the population?
What if a significance level of .05 rather than
.01 had been used?


66. Consider a time series (that is, a sequence of observations
$X_1, X_2, \ldots$ obtained over time) with observed
values $x_1, x_2, \ldots, x_n$. Suppose that the series shows no
upward or downward trend over time. An investigator
will frequently want to know just how strongly values
in the series separated by a specified number of time
units are related. The lag-one sample autocorrelation
coefficient $r_1$ is just the value of the sample correlation
coefficient r for the pairs $(x_1, x_2), (x_2, x_3), \ldots, (x_{n-1}, x_n)$,
that is, pairs of values separated by one time unit.
Similarly, the lag-two sample autocorrelation coefficient
$r_2$ is r for the n − 2 pairs $(x_1, x_3), (x_2, x_4), \ldots, (x_{n-2}, x_n)$.

a. Calculate the values of $r_1$, $r_2$, and $r_3$ for the temperature
data from Exercise 82 of Chapter 1, and comment.

b. Analogous to the population correlation coefficient $\rho$,
let $\rho_1, \rho_2, \ldots$ denote the theoretical or long-run
autocorrelation coefficients at the various lags. If all these
$\rho$’s are 0, there is no (linear) relationship at any lag. In
this case, if n is large, each $R_i$ has approximately a
normal distribution with mean 0 and standard deviation
$1/\sqrt{n}$, and different $R_i$’s are almost independent.
Thus $H_0\colon \rho_i = 0$ can be tested using a z test with
test statistic value $z_i = \sqrt{n}\, r_i$. If n = 100 and $r_1 = .16$,
$r_2 = -.09$, and $r_3 = -.15$, at significance level .05 is
there any evidence of theoretical autocorrelation at the
first three lags?

c. If you are simultaneously testing the null hypothesis in
part (b) for more than one lag, why might you want to
increase the significance level for each test?
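
The definitions in Exercise 66 translate directly into a short computation. The following is a minimal sketch, not a prescribed method: the function names `sample_correlation`, `lag_autocorrelation`, and `z_statistic` are illustrative choices, and the series and numbers in the usage lines are hypothetical rather than taken from any exercise.

```python
import math


def sample_correlation(u, v):
    """Sample correlation coefficient r for the paired lists u and v."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def lag_autocorrelation(series, lag):
    """r for the n - lag pairs (x_1, x_{1+lag}), ..., (x_{n-lag}, x_n)."""
    if lag < 1 or lag > len(series) - 2:
        raise ValueError("lag must leave at least two pairs")
    return sample_correlation(series[:-lag], series[lag:])


def z_statistic(r_lag, n):
    """Large-sample test statistic z = sqrt(n) * r for H0: rho_lag = 0."""
    return math.sqrt(n) * r_lag


# Hypothetical series, used only to show the calls.
demo = [3.1, 2.8, 3.4, 3.0, 2.7, 3.3, 3.2, 2.9]
r1 = lag_autocorrelation(demo, 1)
print(r1, z_statistic(r1, len(demo)))
```

With so short a series the normal approximation behind the z statistic is not trustworthy; the statement in part (b) assumes that n is large.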


67. A sample of n = 500 (x, y) pairs was collected and a test
of $H_0\colon \rho = 0$ versus $H_a\colon \rho \neq 0$ was carried out. The
resulting P-value was computed to be .00032.

a. What conclusion would be appropriate at level of
significance .001?

b. Does this small P-value indicate that there is a very
strong linear relationship between x and y (a value of
$\rho$ that differs considerably from 0)? Explain.

c. Now suppose a sample of n = 10,000 (x, y) pairs
resulted in r = .022. Test $H_0\colon \rho = 0$ versus $H_a\colon \rho \neq 0$
at level .05. Is the result statistically significant?
Comment on the practical significance of your analysis.


SUPPLEMENTARY EXERCISES (68-87) 


68. The appraisal of a warehouse can appear straightforward 
compared to other appraisal assignments. A warehouse 
appraisal involves comparing a building that is primarily 
an open shell to other such buildings. However, there are 
still a number of warehouse attributes that are plausibly 
related to appraised value. The article “Challenges in
Appraising ‘Simple’ Warehouse Properties” (Donald
Sonneman, The Appraisal Journal, April 2001, 174-178)
gives the accompanying data on truss height (ft),
which determines how high stored goods can be stacked,
and sale price ($) per square foot.


Height | 12 14 14 15 15 16 18 22 22 24
Price  | 35.53 37.82 36.90 40.00 38.00 37.50 41.00 48.50 47.00 47.50

Height | 24 26 26 27 28 30 30 33 36
Price  | 46.20 50.35 49.13 48.07 50.90 54.78 54.32 57.17 57.45


a. Is it the case that truss height and sale price are
“deterministically” related, i.e., that sale price is
determined completely and uniquely by truss height?
[Hint: Look at the data.]

b. Construct a scatterplot of the data. What does it 
suggest? 

c. Determine the equation of the least squares line. 

d. Give a point prediction of price when truss height is 
27 ft, and calculate the corresponding residual. 

e. What percentage of observed variation in sale price 
can be attributed to the approximate linear relation- 
ship between truss height and price? 


69. Refer to the previous exercise, which gives data on truss 
heights for a sample of warehouses and the correspond- 
ing sale prices. 

a. Estimate the true average change in sale price associ- 
ated with a one-foot increase in truss height, and do 
so in a way that conveys information about the preci- 
sion of estimation. 

b. Estimate the true average sale price for all warehouses
having a truss height of 25 ft, and do so in a
way that conveys information about the precision of
estimation.

c. Predict the sale price for a single warehouse whose 
truss height is 25 ft, and do so in a way that conveys 
information about the precision of prediction. How 
does this prediction compare to the estimate of (b)? 

d. Without calculating any intervals, how would the 
width of a 95% prediction interval for sale price when 
truss height is 25 ft compare to the width of a 95% 
interval when height is 30 ft? Explain your reasoning. 

e. Calculate and interpret the sample correlation 
coefficient. 


70. Forensic scientists are often interested in making a mea- 
surement of some sort on a body (alive or dead) and then 
using that as a basis for inferring something about the 
age of the body. Consider the accompanying data on age 
(yr) and % D-aspertic acid (hereafter %DAA) from a 
particular tooth (“An Improved Method for Age at 
Death Determination from the Measurements of 
D-Aspertic Acid in Dental Collagen,” Archaeometry, 
1990: 61-70.) 


Age  | 9 10 11 12 13 14 33 39 52 65 69
%DAA | 1.13 1.10 1.11 1.10 1.24 1.31 2.25 2.54 2.93 3.40 4.55


Suppose a tooth from another individual has 2.01%DAA. 
Might it be the case that the individual is younger than 
22? This question was relevant to whether or not the indi- 
vidual could receive a life sentence for murder. 

A seemingly sensible strategy is to regress age
on %DAA and then compute a PI for age when
%DAA = 2.01. However, it is more natural here to
regard age as the independent variable x and %DAA
as the dependent variable y, so the regression model is
%DAA = $\beta_0 + \beta_1 x + \epsilon$. After estimating the regression
coefficients, we can substitute y* = 2.01 into the
estimated equation and then solve for a prediction of age
$\hat{x}$. This “inverse” use of the regression line is called
“calibration.” A PI for age with prediction level approximately
100(1 − α)% is $\hat{x} \pm t_{\alpha/2,\,n-2} \cdot SE$, where

$$SE = \frac{s}{\hat{\beta}_1}\left\{1 + \frac{1}{n} + \frac{(\hat{x} - \bar{x})^2}{S_{xx}}\right\}^{1/2}$$

Calculate this PI for y* = 2.01 and then address the
question posed earlier.
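
The calibration interval just described can be organized as a short routine. This is only a sketch under stated assumptions: the function name `calibration_interval` is illustrative, the t critical value is supplied by the user (for example from a t table) rather than computed, and the data in the usage lines are hypothetical, not the tooth data of the exercise.

```python
import math


def calibration_interval(x, y, y_star, t_crit):
    """Approximate PI for x at a new response y_star (inverse regression).

    t_crit is the critical value t_{alpha/2, n-2}, supplied by the caller.
    Returns (x_hat, se, (lower, upper)).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least squares estimates for the regression of y on x.
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    # Solve y_star = b0 + b1 * x for x, then attach the SE given in the text.
    x_hat = (y_star - b0) / b1
    se = (s / b1) * math.sqrt(1 + 1 / n + (x_hat - x_bar) ** 2 / s_xx)
    half_width = t_crit * abs(se)
    return x_hat, se, (x_hat - half_width, x_hat + half_width)


# Hypothetical data; 3.182 is the tabled t critical value for 3 df, 95% level.
x_demo = [1, 2, 3, 4, 5]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.0]
x_hat, se, interval = calibration_interval(x_demo, y_demo, y_star=6.0, t_crit=3.182)
print(x_hat, se, interval)
```

Taking the absolute value of the standard error keeps the interval endpoints in order when the estimated slope is negative.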


71. Phenolic compounds are found in the effluents of coal 
conversion processes, petroleum refineries, herbicide 
manufacturing, and fiberglass manufacturing. These com- 
pounds are toxic, carcinogenic, and have contributed over 
the past decades to environmental pollution of aquatic 
environments. In one study reported in “Photolysis,
Biodegradation, and Sorption Behavior of Three
Selected Phenolic Compounds on the Surface and
Sediment of Rivers” (J. of Envir. Engr., 2011: 1114-
1121), the authors examined the sorption characteristics
of three selected phenolic compounds. The following data
on y = sorbed concentration (μg/g) and x = equilibrium
concentration (μg/mL) of 2,4-Dinitrophenol (DNP) in a
particular natural river sediment was read from a graph in
the article.


x | .11 .13 .14 .18 .29 .44 .67 .78 .93
y | 1.72 2.17 2.33 3.00 5.17 7.61 11.17 12.72 14.78


a. Calculate point estimates of the slope and intercept
of the population regression line.

b. Does the simple linear regression model specify a
useful relationship between y and x?

c. Confirm that $\hat{y} = 3.404$, $s_{\hat{Y}} = .107$ when x = .2, and
$\hat{y} = 6.616$, $s_{\hat{Y}} = .088$ when x = .4. Explain why $s_{\hat{Y}}$ is
larger when x = .2 than when x = .4.

d. Calculate a confidence interval with a confidence
level of 95% for the true average DNP sorbed
concentration of all river sediment specimens
having an equilibrium concentration of .4.

e. Calculate a prediction interval with a prediction level
of 95% for the DNP sorbed concentration of a single
river sediment specimen having an equilibrium
concentration of .4.


72. The accompanying SAS output is based on
data from the article “Evidence for and the Rate of
Denitrification in the Arabian Sea” (Deep Sea
Research, 1978: 431-435). The variables under study are
x = salinity level (%) and y = nitrate level (M/L).

a. What is the sample size n? [Hint: Look for degrees
of freedom for SSE.]

b. Calculate a point estimate of expected nitrate level
when salinity level is 35.5.

c. Does there appear to be a useful linear relationship
between the two variables?

d. What is the value of the sample correlation coefficient?

e. Would you use the simple linear regression model to
draw conclusions when the salinity level is 40?


SAS output for Exercise 72

Dependent Variable: NITRLVL

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob > F
Model       1         64.49622      64.49622    63.309     0.0002
Error       6          6.11253       1.01875
C Total     7         70.60875

Root MSE    1.00933    R-square   0.9134
Dep Mean   26.91250    Adj R-sq   0.8990
C.V.        3.75043

Parameter Estimates

                 Parameter       Standard       T for H0:
Variable   DF     Estimate          Error   Parameter = 0   Prob > |T|
INTERCEP    1   326.976038    37.71380243           8.670       0.0001
SALINITY    1    -8.403964     1.05621381          -7.957       0.0002


73. The presence of hard alloy carbides in high chromium
white iron alloys results in excellent abrasion resistance,
making them suitable for materials handling in the mining
and materials processing industries. The accompanying
data on x = retained austenite content (%) and y = abrasive
wear loss (mm³) in pin wear tests with garnet as the abrasive
was read from a plot in the article “Microstructure-
Property Relationships in High Chromium White Iron
Alloys” (Intl. Materials Reviews, 1996: 59-82).


x | 4.6 17.0 17.4 18.0 18.5 22.4 26.5 30.0 34.0
y | .66 .92 1.45 1.03 .70 .73 1.20 .80 .91

x | 38.8 48.2 63.5 65.8 73.9 77.2 79.8 84.0
y | 1.19 1.15 1.12 1.37 1.45 1.50 1.36 1.29


SAS output for Exercise 73

Dependent Variable: ABRLOSS

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob > F
Model       1          0.63690       0.63690    15.444     0.0013
Error      15          0.61860       0.04124
C Total    16          1.25551

Root MSE    0.20308    R-square   0.5073
Dep Mean    1.10765    Adj R-sq   0.4744
C.V.       18.33410

Parameter Estimates

                 Parameter       Standard       T for H0:
Variable   DF     Estimate          Error   Parameter = 0   Prob > |T|
INTERCEP    1     0.787218     0.09525879           8.264       0.0001
AUSTCONT    1     0.007570     0.00192626           3.930       0.0013


Use the data and the SAS output above to answer the
following questions.

a. What proportion of observed variation in wear loss
can be attributed to the simple linear regression
model relationship?

b. What is the value of the sample correlation
coefficient?

c. Test the utility of the simple linear regression model
using α = .01.

d. Estimate the true average wear loss when content is
50% and do so in a way that conveys information
about reliability and precision.

e. What value of wear loss would you predict when
content is 30%, and what is the value of the
corresponding residual?


74. The accompanying data was read from a scatterplot in
the article “Urban Emissions Measured with
Aircraft” (J. of the Air and Waste Mgmt. Assoc., 1998:
16-25). The response variable is $\Delta NO_y$, and the
explanatory variable is $\Delta CO$.


ΔCO  | 50 60 95 108 135
ΔNOy | 2.3 4.5 4.0 3.7 8.2

ΔCO  | 210 214 315 720
ΔNOy | 5.4 12 13.8 32.1


a. Fit an appropriate model to the data and judge the
utility of the model.

b. Predict the value of $\Delta NO_y$ that would result from
making one more observation when $\Delta CO$ is 400, and
do so in a way that conveys information about precision
and reliability. Does it appear that $\Delta NO_y$ can be
accurately predicted? Explain.

c. The largest value of $\Delta CO$ is much greater than the
other values. Does this observation appear to have
had a substantial impact on the fitted equation?


75. An investigation was carried out to study the relationship
between speed (ft/sec) and stride rate (number of steps
taken/sec) among female marathon runners. Resulting
summary quantities included n = 11, Σ(speed) = 205.4,
Σ(speed)² = 3880.08, Σ(rate) = 35.16, Σ(rate)² = 112.681,
and Σ(speed)(rate) = 660.130.

a. Calculate the equation of the least squares line that
you would use to predict stride rate from speed.

b. Calculate the equation of the least squares line that
you would use to predict speed from stride rate.

c. Calculate the coefficient of determination for the
regression of stride rate on speed of part (a) and for
the regression of speed on stride rate of part (b). How
are these related?


76. “Mode-mixity” refers to how much of crack propagation is
attributable to the three conventional fracture modes of
opening, sliding, and tearing. For plane problems, only the
first two modes are present, and the mode-mixity angle is
a measure of the extent to which propagation is due to sliding
as opposed to opening. The article “Increasing
Allowable Flight Loads by Improved Structural
Modeling” (AIAA J., 2006: 376-381) gives the following
data on x = mode-mixity angle (degrees) and y = fracture
toughness (N/m) for sandwich panels used in aircraft
construction.


x | 16.52 17.53 18.05 18.50 22.39 23.89 25.50 24.89
y | 609.4 443.1 577.9 628.7 565.7 711.0 863.4 956.2

x | 23.48 24.98 25.55 25.90 22.65 23.69 24.15 24.54
y | 679.5 707.5 767.1 817.8 702.3 903.7 964.9 1047.3


a. Obtain the equation of the estimated regression line,
and discuss the extent to which the simple linear
regression model is a reasonable way to relate fracture
toughness to mode-mixity angle.

b. Does the data suggest that the average change in fracture
toughness associated with a one-degree increase
in mode-mixity angle exceeds 50 N/m? Carry out an
appropriate test of hypotheses.

c. For purposes of precisely estimating the slope of the
population regression line, would it have been preferable
to make observations at the angles 16, 16, 18,
18, 20, 20, 20, 20, 22, 22, 22, 22, 24, 24, 26, and 26
(again a sample size of 16)? Explain your reasoning.

d. Calculate an estimate of true average fracture toughness
and also a prediction of fracture toughness both
for an angle of 18 degrees and for an angle of 22
degrees, do so in a manner that conveys information
about reliability and precision, and then interpret and
compare the estimates and predictions.


77. Open water oil spills can wreak terrible consequences on
the environment and be expensive to clean up. Many
physical and biological methods have been developed to
recover oil from water surfaces. The article “Capacity of
Straw for Repeated Binding of Crude Oil from Salt
Water and Its Effect on Biodegradation” (J. of
Hazardous Toxic and Radioactive Waste, 2012: 75-78)
discussed how wheat straw could be used to extract crude
oil from a water surface. An experiment was conducted in
which crude oil (0 to 16.9 g) was added to 100 mL of saltwater
in separate Petri dishes. Wheat straw (2 g) was then
added to each dish and all dishes were shaken at 70 rpm
overnight. The following data, read from a graph, is based
on x = amount of oil added (in g) and y = the corresponding
amount of oil recovered (in g) from wheat straw.


x | 1.0 1.5 2.1 2.8 3.6 4.5 5.5
y | 0.610 0.840 1.512 1.792 2.952 2.880 4.400

x | 6.6 7.8 9.1 10.5 12.0 13.6 15.2 16.9
y | 5.346 6.396 7.189 8.085 9.840 11.696 13.224 14.365


a. Construct a scatterplot of the data. Does it appear
that recovered oil could be very well predicted by the
value of added oil? Explain your reasoning.

b. Calculate and interpret the coefficient of determination.

c. Does the simple linear regression model appear to
specify a useful relationship between these two variables?
State and test the relevant hypotheses.

d. Predict the value of oil recovered when amount of oil
added is 5.0, and do so in a way that conveys information
about precision and reliability.

e. Without any further calculation, carry out a test of
hypotheses to decide whether the value of $\rho$ is something
other than 0.


78. In Section 12.4, we presented a formula for $V(\hat{\beta}_0 + \hat{\beta}_1 x^*)$
and a CI for $\beta_0 + \beta_1 x^*$. Taking $x^* = 0$ gives $V(\hat{\beta}_0)$ and a CI
for $\beta_0$. Use the data of Example 12.11 to calculate the
estimated standard deviation of $\hat{\beta}_0$ and a 95% CI for the
y-intercept of the true regression line.


79. Show that $SSE = S_{yy} - \hat{\beta}_1 S_{xy}$, which gives an alternative
computational formula for SSE.


80. Suppose that x and y are positive variables and that a
sample of n pairs results in r ≈ 1. If the sample correlation
coefficient is computed for the (x, y²) pairs, will the
resulting value also be approximately 1? Explain.


81. Let $s_x$ and $s_y$ denote the sample standard deviations of the
observed x’s and y’s, respectively [that is,
$s_x^2 = \sum (x_i - \bar{x})^2/(n - 1)$ and similarly for $s_y^2$].

a. Show that an alternative expression for the estimated
regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ is

$$\hat{y} = \bar{y} + r \cdot \frac{s_y}{s_x}(x - \bar{x})$$

b. This expression for the regression line can be interpreted
as follows. Suppose r = .5. What then is the
predicted y for an x that lies 1 SD ($s_x$ units) above the
mean of the $x_i$’s? If r were 1, the prediction would be
for y to lie 1 SD above its mean $\bar{y}$, but since r = .5,
we predict a y that is only .5 SD ($.5 s_y$ unit) above $\bar{y}$.
Using the data in Exercise 64, when UV transparency
index is 1 SD below the average in the sample,
by how many standard deviations is the predicted
maximum prevalence above or below its average for
the sample?


82. Verify that the t statistic for testing $H_0\colon \beta_1 = 0$ in Section
12.3 is identical to the t statistic in Section 12.5 for testing
$H_0\colon \rho = 0$.


83. Use the formula for computing SSE to verify that
$r^2 = 1 - SSE/SST$.


84. In biofiltration of wastewater, air discharged from a treatment
facility is passed through a damp porous membrane
that causes contaminants to dissolve in water and be
transformed into harmless products. The accompanying
data on x = inlet temperature (°C) and y = removal efficiency
(%) was the basis for a scatterplot that appeared in
the article “Treatment of Mixed Hydrogen Sulfide and
Organic Vapors in a Rock Medium Biofilter” (Water
Environment Research, 2001: 426-435).

Obs   Temp   Removal %     Obs   Temp   Removal %
  1    7.68    98.09        17    8.55    98.27
  2    6.51    98.25        18    7.57    98.00
  3    6.43    97.82        19    6.94    98.09
  4    5.48    97.82        20    8.32    98.25
  5    6.57    97.82        21   10.50    98.41
  6   10.22    97.93        22   16.02    98.51
  7   15.69    98.38        23   17.83    98.71
  8   16.77    98.89        24   17.03    98.79
  9   17.13    98.96        25   16.18    98.87
 10   17.63    98.90        26   16.26    98.76
 11   16.72    98.68        27   14.44    98.58
 12   15.45    98.69        28   12.78    98.73
 13   12.06    98.51        29   12.25    98.45
 14   11.44    98.09        30   11.69    98.37
 15   10.17    98.25        31   11.34    98.36
 16    9.64    98.36        32   10.97    98.45


Calculated summary quantities are $\sum x_i = 384.26$, $\sum y_i = 3149.04$,
$\sum x_i^2 = 5099.2412$, $\sum x_i y_i = 37{,}850.7762$, and
$\sum y_i^2 = 309{,}892.6548$.


a. Does a scatterplot of the data suggest appropriate- 
ness of the simple linear regression model? 

b. Fit the simple linear regression model, obtain a point 
prediction of removal efficiency when temperature 
= 10.50, and calculate the value of the correspond- 
ing residual. 

c. Roughly what is the size of a typical deviation of 
points in the scatterplot from the least squares line? 

d. What proportion of observed variation in removal 
efficiency can be attributed to the model relationship? 

e. Estimate the slope coefficient in a way that conveys 
information about reliability and precision, and inter- 
pret your estimate. 

f. Personal communication with the authors of the article
revealed that there was one additional observation that
was not included in their scatterplot: (6.53, 96.55).
What impact does this additional observation have on
the equation of the least squares line and the values of s
and $r^2$?


85. Normal hatchery processes in aquaculture inevitably pro- 
duce stress in fish, which may negatively impact growth, 
reproduction, flesh quality, and susceptibility to disease. 
Such stress manifests itself in elevated and sustained 
corticosteroid levels. The article “Evaluation of Simple
Instruments for the Measurement of Blood Glucose
and Lactate, and Plasma Protein as Stress Indicators
in Fish” (J. of the World Aquaculture Society, 1999:
276-284) described an experiment in which fish were
subjected to a stress protocol and then removed and tested
at various times after the protocol had been applied. The
accompanying data on x = time (min) and y = blood
glucose level (mmol/L) was read from a plot.


x | 2 2 5 7 12 13 17 18 23 24 26 28
y | 4.0 3.6 3.7 4.0 3.8 4.0 5.1 3.9 4.4 4.3 4.3 4.4

x | 29 30 34 36 40 41 44 56 56 57 60 60
y | 5.8 4.3 5.5 5.6 5.1 5.7 6.1 5.1 5.9 6.8 4.9 5.7


Use the methods developed in this chapter to analyze 
the data, and write a brief report summarizing your con- 
clusions (assume that the investigators are particularly 
interested in glucose level 30 min after stress). 


86. The article “Evaluating the BOD POD for Assessing
Body Fat in Collegiate Football Players” (Medicine
and Science in Sports and Exercise, 1999: 1350-1356)
reports on a new air displacement device for measuring
body fat. The customary procedure utilizes the hydrostatic
weighing device, which measures the percentage
of body fat by means of water displacement. Here is
representative data read from a graph in the paper.


BOD | 2.5 4.0 4.1 6.2 7.1 7.0 8.3 9.2 9.3 12.0 12.2
HW  | 8.0 6.2 9.2 6.4 8.6 12.2 7.2 12.0 14.9 12.1 15.3

BOD | 12.6 14.2 14.4 15.1 15.2 16.3 17.1 17.9 17.9
HW  | 14.8 14.3 16.3 17.9 19.5 17.5 14.3 18.3 16.2


a. Use various methods to decide whether it is plausible
that the two techniques measure on average the same
amount of fat.

b. Use the data to develop a way of predicting an HW
measurement from a BOD POD measurement, and
investigate the effectiveness of such predictions.


87. Reconsider the situation of Exercise 73, in which x =
retained austenite content using a garnet abrasive and y =
abrasive wear loss were related via the simple linear
regression model $Y = \beta_0 + \beta_1 x + \epsilon$. Suppose that for a
second type of abrasive, these variables are also related
via the simple linear regression model $Y = \gamma_0 + \gamma_1 x + \epsilon$
and that $V(\epsilon) = \sigma^2$ for both types of abrasive. If the data
set consists of $n_1$ observations on the first abrasive and $n_2$
on the second and if $SSE_1$ and $SSE_2$ denote the two error
sums of squares, then a pooled estimate of $\sigma^2$ is
$\hat{\sigma}^2 = (SSE_1 + SSE_2)/(n_1 + n_2 - 4)$. Let $SS_{x1}$ and $SS_{x2}$
denote $\sum (x_i - \bar{x})^2$ for the data on the first and second
abrasives, respectively. A test of $H_0\colon \beta_1 - \gamma_1 = 0$ (equal
slopes) is based on the statistic

$$T = \frac{\hat{\beta}_1 - \hat{\gamma}_1}{\hat{\sigma}\sqrt{\dfrac{1}{SS_{x1}} + \dfrac{1}{SS_{x2}}}}$$

When $H_0$ is true, T has a t distribution with $n_1 + n_2 - 4$ df.
Suppose the 15 observations using the alternative abrasive
give $SS_{x2} = 7152.5578$, $\hat{\gamma}_1 = .006845$, and $SSE_2 = .51350$.
Using this along with the data of Exercise 73,
carry out a test at level .05 to see whether expected change
in wear loss associated with a 1% increase in austenite
content is identical for the two types of abrasive.


(5th ed.), Irwin, Homewood, IL, 2005. The first 14 chapters 
constitute an extremely readable and informative survey of 
regression analysis.


Nonlinear and Multiple Regression


INTRODUCTION 


The probabilistic model studied in Chapter 12 specified that the observed
value of the dependent variable Y deviated from the linear regression function
$\mu_{Y \cdot x} = \beta_0 + \beta_1 x$ by a random amount. Here we consider two ways of generalizing
the simple linear regression model. The first way is to replace $\beta_0 + \beta_1 x$ by a
nonlinear function of x, and the second is to use a regression function involving more
than a single independent variable. After fitting a regression function of the chosen
form to the given data, it is of course important to have methods available
for making inferences about the parameters of the chosen model. Before these
methods are used, though, the data analyst should first assess the adequacy of
the chosen model. In Section 13.1, we discuss methods, based primarily on a
graphical analysis of the residuals (observed minus predicted y's), for checking
model adequacy.

In Section 13.2, we consider nonlinear regression functions of a single
independent variable x that are “intrinsically linear.” By this we mean that
it is possible to transform one or both of the variables so that the relationship
between the resulting variables is linear. An alternative class of nonlinear
relations is obtained by using polynomial regression functions of the form
$\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k$; these polynomial models are the subject
of Section 13.3. Multiple regression analysis involves building models for
relating y to two or more independent variables. The focus in Section 13.4 is on
interpretation of various multiple regression models and on understanding and
using the regression output from various statistical computer packages. The last
section of the chapter surveys some extensions and pitfalls of multiple regression
modeling.


13.1 Assessing Model Adequacy


A plot of the observed pairs $(x_i, y_i)$ is a necessary first step in deciding on the form of
a mathematical relationship between x and y. It is possible to fit many functions other
than a linear one ($y = b_0 + b_1 x$) to the data, using either the principle of least squares or
another fitting method. Once a function of the chosen form has been fitted, it is important
to check the fit of the model to see whether it is in fact appropriate. One way to study
the fit is to superimpose a graph of the best-fit function on the scatterplot of the data.
However, any tilt or curvature of the best-fit function may obscure some aspects of the
fit that should be investigated. Furthermore, the scale on the vertical axis may make it
difficult to assess the extent to which observed values deviate from the best-fit function.


Residuals and Standardized Residuals 


A more effective approach to assessment of model adequacy is to compute the fitted
or predicted values $\hat{y}_i$ and the residuals $e_i = y_i - \hat{y}_i$ and then plot various functions
of these computed quantities. We then examine the plots either to confirm our choice
of model or for indications that the model is not appropriate. Suppose the simple
linear regression model is correct, and let $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ be the equation of the
estimated regression line. Then the ith residual is $e_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)$. To derive
properties of the residuals, let $e_i = Y_i - \hat{Y}_i$ represent the ith residual as a random
variable (rv) before observations are actually made. Then

$$E(Y_i - \hat{Y}_i) = E(Y_i) - E(\hat{\beta}_0 + \hat{\beta}_1 x_i) = \beta_0 + \beta_1 x_i - (\beta_0 + \beta_1 x_i) = 0 \qquad (13.1)$$

Because $\hat{Y}_i$ $(= \hat{\beta}_0 + \hat{\beta}_1 x_i)$ is a linear function of the $Y_j$’s, so is $Y_i - \hat{Y}_i$ (the coefficients
depend on the $x_j$’s). Thus the normality of the $Y_j$’s implies that each residual is normally
distributed. It can also be shown that

$$V(Y_i - \hat{Y}_i) = \sigma^2 \cdot \left[1 - \frac{1}{n} - \frac{(x_i - \bar{x})^2}{S_{xx}}\right] \qquad (13.2)$$

Replacing $\sigma^2$ by $s^2$ and taking the square root of Equation (13.2) gives the estimated
standard deviation of a residual.

Let’s now standardize each residual by subtracting the mean value (zero) and
then dividing by the estimated standard deviation.


The standardized residuals are given by

$$e_i^* = \frac{y_i - \hat{y}_i}{s\sqrt{1 - \dfrac{1}{n} - \dfrac{(x_i - \bar{x})^2}{S_{xx}}}}, \qquad i = 1, \ldots, n \qquad (13.3)$$


If, for example, a particular standardized residual is 1.5, then the residual itself is
1.5 (estimated) standard deviations larger than what would be expected from fitting
the correct model. Notice that the variances of the residuals differ from one another.
In fact, because there is a − sign in front of $(x_i - \bar{x})^2$, the variance of a residual
decreases as $x_i$ moves further away from the center of the data $\bar{x}$. Intuitively, this is
because the least squares line is pulled toward an observation whose $x_i$ value lies far
to the right or left of other observations in the sample. Computation of the $e_i^*$’s can
be tedious, but the most widely used statistical computer packages will provide these
values and construct various plots involving them.


EXAMPLE 13.1 Exercise 19 in Chapter 12 presented data on x = burner area liberation rate and
y = $NO_x$ emissions. Here we reproduce the data and give the fitted values, residuals,
and standardized residuals. The estimated regression line is $\hat{y} = -45.55 + 1.71x$,
and $r^2 = .961$. The standardized residuals are not a constant multiple of the residuals
because the residual variances differ somewhat from one another.


x_i    y_i    ŷ_i      e_i     e_i*
100    150    125.6    24.4     .75
125    140    168.4   −28.4    −.84
125    180    168.4    11.6     .35
150    210    211.1    −1.1    −.03
150    190    211.1   −21.1    −.62
200    320    296.7    23.3     .66
200    280    296.7   −16.7    −.47
250    400    382.3    17.7     .50
250    430    382.3    47.7    1.35
300    440    467.9   −27.9    −.80
300    390    467.9   −77.9   −2.24
350    600    553.4    46.6    1.39
400    610    639.0   −29.0    −.92
400    670    639.0    31.0     .99
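
The quantities tabled in Example 13.1 can be recomputed from the (x, y) pairs by applying the least squares formulas together with Equations (13.2) and (13.3). The following is a minimal sketch using only the standard library; the function name `fit_and_standardize` and the dictionary keys are illustrative choices, not notation from the text. The printed columns should agree with the table up to rounding.

```python
import math

# Data reproduced from Example 13.1 (x = burner area liberation rate,
# y = NOx emissions).
x = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]
y = [150, 140, 180, 210, 190, 320, 280, 400, 430, 440, 390, 600, 610, 670]


def fit_and_standardize(x, y):
    """Least squares fit, residuals, and standardized residuals (13.3)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_yy = sum((yi - y_bar) ** 2 for yi in y)

    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]

    sse = sum(e ** 2 for e in resid)
    s = math.sqrt(sse / (n - 2))  # estimate of sigma
    # Each residual is divided by its own estimated standard deviation.
    std_resid = [
        e / (s * math.sqrt(1 - 1 / n - (xi - x_bar) ** 2 / s_xx))
        for e, xi in zip(resid, x)
    ]
    return {
        "b0": b0,
        "b1": b1,
        "r2": 1 - sse / s_yy,
        "fitted": fitted,
        "resid": resid,
        "std_resid": std_resid,
    }


result = fit_and_standardize(x, y)
rows = zip(x, y, result["fitted"], result["resid"], result["std_resid"])
for xi, yi, fi, e, es in rows:
    print(f"{xi:4d} {yi:4d} {fi:7.1f} {e:7.1f} {es:6.2f}")
```

Because the divisor depends on $x_i$, two observations with equal residuals but different x values receive different standardized residuals, which is the point made at the end of the example.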


Diagnostic Plots 


The basic plots that many statisticians recommend for an assessment of model 
validity and usefulness are the following: 


1. e* (or e) on the vertical axis versus x on the horizontal axis; that is, a plot of
the $(x_i, e_i^*)$ pairs [or the $(x_i, e_i)$ pairs]

2. e* (or e) on the vertical axis versus $\hat{y}$ on the horizontal axis; that is, a plot of
the $(\hat{y}_i, e_i^*)$ pairs [or the $(\hat{y}_i, e_i)$ pairs]

3. $\hat{y}$ on the vertical axis versus y on the horizontal axis; that is, a plot of the $(y_i, \hat{y}_i)$ pairs


4. A normal probability plot of the standardized residuals 


Plots 1 and 2 are called residual plots (against the independent variable and fitted 
values, respectively). Since $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ is a linear function of x, the general pattern
of points in Plot 2 should be identical to that in Plot 1, though the horizontal
scales will differ (in multiple regression, there is a Plot 1 for each predictor, and Plot 
2 is a single omnibus picture that combines information from all of those). Provided 
that the chosen model is correct, neither residual plot should exhibit any discernible 
pattern. The residuals should be randomly distributed about 0 according to a normal 
distribution, so all or almost all e*’s should lie between —2 and +2. 

We hope that the fitted model will give predicted y values that are close to 
their observed counterparts. This would manifest itself in Plot 3 by plotted points 
falling close to a 45° line. Thus this plot provides a visual assessment of model 
effectiveness in making predictions. Plot 4 allows the analyst to assess the plausibility
of assuming that the random deviation $\epsilon$ in the model equation has a normal
distribution. If the pattern in the plot departs substantially from linearity, then the
inferential procedures from Chapter 12 based on the $t_{n-2}$ distribution should not be
used as a basis for drawing conclusions.


EXAMPLE 13.2 (Example 13.1 continued)


Figure 13.1 presents a scatterplot of the data and the four plots just recommended. The
plot of $\hat{y}$ versus y confirms the impression given by $r^2$ that x is effective in predicting y
and also indicates that there is no observed y for which the predicted value is terribly 
far off the mark. Both residual plots show no unusual pattern or discrepant values. 
There is one standardized residual slightly outside the interval (—2, 2), but this is 
not surprising in a sample of size 14. The normal probability plot of the standardized 
residuals is reasonably straight. In summary, the plots leave us with no qualms about 
either the appropriateness of a simple linear relationship or the fit to the given data. 

Figure 13.1 Plots for the data from Example 13.1


Difficulties and Remedies


Although we hope that our analysis will yield plots like those of Figure 13.1, quite 
frequently the plots will suggest one or more of the following difficulties: 


1. A nonlinear probabilistic relationship between x and y is appropriate. 


2. The variance of $\epsilon$ (and of Y) is not a constant $\sigma^2$, but instead depends somehow
on x.


3. The selected model fits the data well except for a very few discrepant or outlying
data values, which may have greatly influenced the choice of the best-fit function.


4. The error variable $\epsilon$ does not have a normal distribution.


5. When the subscript i indicates the time order of the observations, the $\epsilon_i$'s
exhibit dependence over time.


6. One or more relevant independent variables have been omitted from the model. 


Figure 13.2 presents residual plots corresponding to items 1-3, 5, and 6. In 
Chapter 4, we discussed patterns in normal probability plots that cast doubt on the 
assumption of an underlying normal distribution. Notice that the residuals from the data 
in Figure 13.2(d) with the circled point included would not by themselves necessarily 
suggest further analysis, yet when a new line is fit with that point deleted, the new line 
differs considerably from the original line. This type of behavior is more difficult to 
identify in multiple regression. It is most likely to arise when there is a single (or very
few) data point(s) with independent variable value(s) far removed from the remainder
of the data.

Figure 13.2 Plots that indicate abnormality in data: (a) nonlinear relationship; (b) nonconstant
variance; (c) discrepant observation; (d) observation with large influence; (e) dependence in errors;
(f) variable omitted

We now indicate briefly what remedies are available for these types of difficulties.
For a more comprehensive discussion, one or more of the references on regression
analysis should be consulted. If the residual plot looks something like that of
Figure 13.2(a), exhibiting a curved pattern, then a nonlinear function of x may be fit.

The residual plot of Figure 13.2(b) suggests that, although a straight-line
relationship may be reasonable, the assumption that $V(Y_i) = \sigma^2$ for each i is of doubtful
validity. When the assumptions of Chapter 12 are valid, it can be shown that
among all unbiased estimators of $\beta_0$ and $\beta_1$, the ordinary least squares estimators
have minimum variance. These estimators give equal weight to each $(x_i, Y_i)$. If the
variance of Y increases with x, then $Y_i$'s for large $x_i$ should be given less weight than
those with small $x_i$. This suggests that $\beta_0$ and $\beta_1$ should be estimated by minimizing

$$f_w(b_0, b_1) = \sum w_i [y_i - (b_0 + b_1 x_i)]^2$$ (13.4)

where the $w_i$'s are weights that decrease with increasing $x_i$. Minimization of Expression
(13.4) yields weighted least squares estimates. For example, if the standard deviation
of Y is proportional to x (for x > 0), that is, $V(Y) = kx^2$, then it can be shown that
the weights $w_i = 1/x_i^2$ yield best estimators of $\beta_0$ and $\beta_1$. Weighted least squares is
used quite frequently by econometricians (economists who use statistical methods) to
estimate parameters.
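
Minimizing Expression (13.4) has a closed-form solution in which ordinary means are replaced by weighted means. The Python sketch below, assuming NumPy, illustrates this with the weights $w_i = 1/x_i^2$ discussed above. The function name `weighted_least_squares` and the small data set are hypothetical, chosen only for illustration.

```python
import numpy as np

def weighted_least_squares(x, y, w):
    """Return (b0, b1) minimizing sum(w * (y - (b0 + b1 * x))**2)."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    xw = np.sum(w * x) / np.sum(w)          # weighted mean of x
    yw = np.sum(w * y) / np.sum(w)          # weighted mean of y
    b1 = np.sum(w * (x - xw) * (y - yw)) / np.sum(w * (x - xw) ** 2)
    b0 = yw - b1 * xw
    return b0, b1

# Hypothetical data whose spread grows roughly in proportion to x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([5.1, 6.8, 9.4, 10.5, 13.9, 13.2, 18.6, 16.1])

b0_wls, b1_wls = weighted_least_squares(x, y, 1.0 / x ** 2)
b0_ols, b1_ols = weighted_least_squares(x, y, np.ones_like(x))  # equal weights = ordinary LS
print("weighted:", b0_wls, b1_wls)
print("ordinary:", b0_ols, b1_ols)
```

With equal weights the function reduces to the ordinary least squares line, so the two printed fits show how much the downweighting of large-x observations changes the estimates.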

When plots or other evidence suggest that the data set contains outliers or points 
having large influence on the resulting fit, one possible approach is to omit these outly- 
ing points and recompute the estimated regression equation. This would certainly be 
correct if it were found that the outliers resulted from errors in recording data values 
or experimental errors. If no assignable cause can be found for the outliers, it is still 
desirable to report the estimated equation both with and without outliers omitted. Yet 
another approach is to retain possible outliers but to use an estimation principle that 
puts relatively less weight on outlying values than does the principle of least squares. 
One such principle is MAD (minimize absolute deviations), which selects $\hat{\beta}_0$ and $\hat{\beta}_1$
to minimize $\sum |y_i - (b_0 + b_1 x_i)|$. Unlike the estimates of least squares, there are no
nice formulas for the MAD estimates; their values must be found by using an iterative
computational procedure. Such procedures are also used when it is suspected that the
$\epsilon_i$'s have a distribution that is not normal but instead has “heavy tails” (making it
much more likely than for the normal distribution that discrepant values will enter the
sample); robust regression procedures are those that produce reliable estimates for a
wide variety of underlying error distributions. Least squares estimators are not robust
in the same way that the sample mean $\bar{X}$ is not a robust estimator for $\mu$.
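
One iterative procedure that can be used for MAD fitting is iteratively reweighted least squares, sketched below in Python with NumPy. This is only one of several possible algorithms (linear programming is another) and is not necessarily what any particular package uses. The function name `mad_line`, the tolerance `eps`, and the data with one planted outlier are assumptions made for illustration.

```python
import numpy as np

def mad_line(x, y, n_iter=100, eps=1e-8):
    """Approximate the (b0, b1) minimizing sum |y - (b0 + b1 * x)|
    by iteratively reweighted least squares."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / np.maximum(np.abs(r), eps)       # large residuals receive small weight
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        if np.allclose(beta_new, beta, atol=1e-12):
            break
        beta = beta_new
    return beta

# Hypothetical data: nine points exactly on y = 1 + 2x plus one discrepant value
x = np.arange(10.0)
y = 1.0 + 2.0 * x
y[9] = 100.0

b_ols = np.polyfit(x, y, 1)[::-1]                  # (intercept, slope) from least squares
b_mad = mad_line(x, y)
print("least squares:", b_ols)
print("MAD:          ", b_mad)
```

In this constructed example the least squares line is pulled strongly toward the discrepant point, whereas the MAD line stays essentially on the other nine observations.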

When a plot suggests time dependence in the error terms, an appropriate 
analysis may involve a transformation of the y’s or else a model explicitly including 
a time variable. Lastly, a plot such as that of Figure 13.2(f), which shows a pattern 
in the residuals when plotted against an omitted variable, suggests that a multiple 
regression model that includes the previously omitted variable should be considered. 


EXERCISES Section 13.1 (1-14) 


1. Suppose the variables x = commuting distance and y = commuting time are related
according to the simple linear regression model with $\sigma = 10$.
a. If n = 5 observations are made at the x values $x_1 = 5$, $x_2 = 10$, $x_3 = 15$,
$x_4 = 20$, and $x_5 = 25$, calculate the standard deviations of the five corresponding residuals.
b. Repeat part (a) for $x_1 = 5$, $x_2 = 10$, $x_3 = 15$, $x_4 = 20$, and $x_5 = 50$.
c. What do the results of parts (a) and (b) imply about the deviation of the estimated
line from the observation made at the largest sampled x value?

2. The x values and standardized residuals for the chlorine flow/etch rate data of
Exercise 52 (Section 12.4) are displayed in the accompanying table. Construct a
standardized residual plot and comment on its appearance.

x  | 1.50  1.50   2.00   2.50  2.50
e* |  .31  1.02  -1.15  -1.23   .23
x  | 3.00  3.50   3.50   4.00
e* |  .73 -1.36   1.53    .07

3. Example 12.6 presented the residuals from a simple linear regression of moisture
content y on filtration rate x.
a. Plot the residuals against x. Does the resulting plot suggest that a straight-line
regression function is a reasonable choice of model? Explain your reasoning.
b. Using s = .665, compute the values of the standardized residuals. Is $e_i^* \approx e_i/s$
for i = 1, ..., n, or are the $e_i^*$'s not close to being proportional to the $e_i$'s?
c. Plot the standardized residuals against x. Does the plot differ significantly in
general appearance from the plot of part (a)?

4. The accompanying data on y = normalized energy (J/m²) and x = intraocular pressure
(mmHg) appeared in a scatterplot in the article “Evaluating the Risk of Eye Injuries:
Intraocular Pressure During High Speed Projectile Impacts” (Current Eye Research,
2012: 43-49); an estimated regression function was superimposed on the plot.

x |  2761  19764  25713  3980  12782  19008
y |  1553  14999  32813  1667   8741  16526
x | 20782  19028  14397  9606   3905  25731
y | ...

a. Here is Minitab output from fitting the simple linear regression model. Does the
model appear to specify a useful relationship between the two variables?

Predictor     Coef   SE Coef     T      P
Pressure    1.2912    0.1347  9.59  0.000
S = 3679.36   R-Sq = 90.2%   R-Sq(adj) = 89.2%

b. The standardized residuals resulting from fitting the simple linear regression model
(in the same order as the observations) are .98, -1.57, 1.47, .50, -.76, -.84, 1.47, -.85,
-1.03, -.20, .40, and .81. Construct a plot of e* versus x and comment. [Note: The model
fit in the cited article was not linear.]

5. As the air temperature drops, river water becomes supercooled and ice crystals form.
Such ice can significantly affect the hydraulics of a river. The article “Laboratory
Study of Anchor Ice Growth” (J. of Cold Regions Engr., 2001: 60-66) described an
experiment in which ice thickness (mm) was studied as a function of elapsed time (hr)
under specified conditions. The following data was read from a graph in the article:
n = 33; x = .17, .33, .50, .67, ..., 5.50; y = .50, 1.25, 1.50, 2.75, 3.50, 4.75, 5.75,
5.60, 7.00, 8.00, 8.25, 9.50, 10.50, 11.00, 10.75, 12.50, 12.25, 13.25, 15.50, 15.00,
15.25, 16.25, 17.25, 18.00, 18.25, 18.15, 20.25, 19.50, 20.00, 20.50, 20.60, 20.50, 19.80.
a. The $r^2$ value resulting from a least squares fit is .977. Interpret this value and
comment on the appropriateness of assuming an approximate linear relationship.
b. The residuals, listed in the same order as the x values, are

-1.03  -0.92  -1.35  -0.78  -0.68  -0.11   0.21
-0.59   0.13   0.45   0.06   0.62   0.94   0.80
-0.14   0.93   0.04   0.36   1.92   0.78   0.35
 0.67   1.02   1.09   0.66  -0.09   1.33  -0.10
-0.24  -0.43  -1.01  -1.75  -3.14

Plot the residuals against elapsed time. What does the plot suggest?

6. The accompanying scatterplot is based on data provided by authors of the article
“Spurious Correlation in the USEPA Rating Curve Method for Estimating Pollutant
Loads” (J. of Envir. Engr., 2008: 610-618); here discharge is in ft³/s as opposed to
m³/s used in the article. The point on the far right of the plot corresponds to the
observation (140, 1529.35). The resulting standardized residual is 3.10. Minitab flags
the observation with an R for large residual and an X for potentially influential
observation. Here is some information on the estimated slope:

                       Full sample    (140, 1529.35) deleted
$\hat{\beta}_1$             9.9050           8.8241
$s_{\hat{\beta}_1}$          .3806            .4734

Does this observation appear to have had a substantial impact on the estimated slope?
Explain.

[Scatterplot of load versus discharge (cfs) with the fitted line
Load = -13.58 + 9.905 Discharge; S = 69.0107, R-Sq = 92.5%]

7. Composite honeycomb sandwich panels are widely


used in various aerospace structural applications such as ribs, flaps, and rudders.
The article “Core Crush Problem in Manufacturing of Composite Sandwich
Structures: Mechanisms and Solutions” (Amer. Inst. of Aeronautics and
Astronautics J., 2006: 901-907) fit a line to the following data on
x = prepreg thickness (mm) and y = core crush (%):

x | .246  .250  .251  .251  .254  .262  .264  .270
y | 16.0  11.0  15.0  10.5  13.5   7.5   6.1   1.7
x | .272  .277  .281  .289  .290  .292  .293
y |  3.6   0.7   0.9   1.0   0.7   3.0   3.1

a. Fit the simple linear regression model. What proportion of the observed
variation in core crush can be attributed to the model relationship?

b. Construct a scatterplot. Does the plot suggest that a 
linear probabilistic relationship is appropriate? 

c. Obtain the residuals and standardized residuals, and 
then construct residual plots. What do these plots sug- 
gest? What type of function should provide a better fit 
to the data than does a straight line? 


8. Continuous recording of heart rate can be used to obtain
information about the level of exercise intensity or physical
strain during sports participation, work, or other daily
activities. The article “The Relationship Between Heart
Rate and Oxygen Uptake During Non-Steady State
Exercise” (Ergonomics, 2000: 1578-1592) reported on a
study to investigate using heart rate response (x, as a percentage
of the maximum rate) to predict oxygen uptake (y,
as a percentage of maximum uptake) during exercise. The
accompanying data was read from a graph in the article.


HR   | 43.5  44.0  44.0  44.5  44.0  45.0  48.0  49.0
VO2  | 22.0  21.0  22.0  21.5  25.5  24.5  30.0  28.0
HR   | 49.5  51.0  54.5  57.5  57.7  61.0  63.0  72.0
VO2  | 32.0  29.0  38.5  30.5  57.0  40.0  58.0  72.0


Use a statistical software package to perform a simple lin- 
ear regression analysis, paying particular attention to the 
presence of any unusual or influential observations. 


9. Consider the following four (x, y) data sets; the first three
have the same x values, so these values are listed only once
(Frank Anscombe, “Graphs in Statistical Analysis,”
Amer. Statistician, 1973: 17-21):


Data Set    1-3      1      2      3      4      4
Variable     x       y      y      y      x      y

           10.0    8.04   9.14   7.46    8.0   6.58
            8.0    6.95   8.14   6.77    8.0   5.76
           13.0    7.58   8.74  12.74    8.0   7.71
            9.0    8.81   8.77   7.11    8.0   8.84
           11.0    8.33   9.26   7.81    8.0   8.47
           14.0    9.96   8.10   8.84    8.0   7.04
            6.0    7.24   6.13   6.08    8.0   5.25
            4.0    4.26   3.10   5.39   19.0  12.50
           12.0   10.84   9.13   8.15    8.0   5.56
            7.0    4.82   7.26   6.42    8.0   7.91
            5.0    5.68   4.74   5.73    8.0   6.89


For each of these four data sets, the values of the summary
statistics $\sum x_i$, $\sum x_i^2$, $\sum y_i$, $\sum y_i^2$, and $\sum x_i y_i$ are virtually
identical, so all quantities computed from these five will be
essentially identical for the four sets: the least squares line
(y = 3 + .5x), SSE, $s^2$, $r^2$, t intervals, t statistics, and so
on. The summary statistics provide no way of distinguishing
among the four data sets. Based on a scatterplot and
a residual plot for each set, comment on the appropriateness
or inappropriateness of fitting a straight-line model;
include in your comments any specific suggestions for how
a “straight-line analysis” might be modified or qualified.


10. a. Show that $\sum_{i=1}^{n} e_i = 0$ when the $e_i$'s are the residuals
from a simple linear regression.

b. Are the residuals from a simple linear regression
independent of one another, positively correlated, or
negatively correlated? Explain.

c. Show that $\sum_{i=1}^{n} x_i e_i = 0$ for the residuals from a simple
linear regression. (This result along with part (a) shows
that there are two linear restrictions on the $e_i$'s, resulting
in a loss of 2 df when the squared residuals are used to
estimate $\sigma^2$.)

d. Is it true that $\sum_{i=1}^{n} e_i^* = 0$? Give a proof or a
counterexample.

11. a. Express the ith residual $Y_i - \hat{Y}_i$ (where $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$)
in the form $\sum c_j Y_j$, a linear function of the $Y_j$'s. Then
use rules of variance to verify that $V(Y_i - \hat{Y}_i)$ is given
by Expression (13.2).

b. It can be shown that $\hat{Y}_i$ and $Y_i - \hat{Y}_i$ (the ith predicted
value and residual) are independent of one another.
Use this fact, the relation $Y_i = \hat{Y}_i + (Y_i - \hat{Y}_i)$, and the
expression for $V(\hat{Y})$ from Section 12.4 to again verify
Expression (13.2).

c. As $x_i$ moves farther away from $\bar{x}$, what happens to
$V(\hat{Y}_i)$ and to $V(Y_i - \hat{Y}_i)$?

12. a. Could a linear regression result in residuals 23, -27,
5, 17, -8, 9, and 15? Why or why not?

b. Could a linear regression result in residuals 23, -27,
5, 17, -8, -12, and 2 corresponding to x values 3, -4,
8, 12, -14, -20, and 25? Why or why not? [Hint: See
Exercise 10.]


13. Recall that $\hat{\beta}_0 + \hat{\beta}_1 x$ has a normal distribution with
expected value $\beta_0 + \beta_1 x$ and variance

$$\sigma^2 \left\{ \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right\}$$

so that

$$Z = \frac{\hat{\beta}_0 + \hat{\beta}_1 x - (\beta_0 + \beta_1 x)}{\sigma \left( \dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right)^{1/2}}$$

has a standard normal distribution. If $S = \sqrt{SSE/(n - 2)}$
is substituted for $\sigma$, the resulting variable has a t distribution
with n - 2 df. By analogy, what is the distribution
of any particular standardized residual? If n = 25, what is
the probability that a particular standardized residual falls
outside the interval (-2.50, 2.50)?


14. If there is at least one x value at which more than one observation
has been made, there is a formal test procedure for testing

$H_0$: $\mu_{Y \cdot x} = \beta_0 + \beta_1 x$ for some values $\beta_0$, $\beta_1$ (the
true regression function is linear)

versus

$H_a$: $H_0$ is not true (the true regression function is not
linear)


Suppose observations are made at $x_1, x_2, \ldots, x_c$. Let
$Y_{11}, Y_{12}, \ldots, Y_{1n_1}$ denote the $n_1$ observations when
$x = x_1$; ...; $Y_{c1}, Y_{c2}, \ldots, Y_{cn_c}$ denote the $n_c$ observations
when $x = x_c$. With $n = \sum n_i$ (the total number of observations),
SSE has n - 2 df. We break SSE into two pieces,
SSPE (pure error) and SSLF (lack of fit), as follows:

$$\text{SSPE} = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i\cdot})^2 = \sum_i \sum_j Y_{ij}^2 - \sum_i n_i \bar{Y}_{i\cdot}^2$$

$$\text{SSLF} = \text{SSE} - \text{SSPE}$$

The $n_i$ observations at $x_i$ contribute $n_i - 1$ df to SSPE,
so the number of degrees of freedom for SSPE is
$\sum_i (n_i - 1) = n - c$, and the degrees of freedom for SSLF
is n - 2 - (n - c) = c - 2. Let MSPE = SSPE/(n - c)
and MSLF = SSLF/(c - 2). Then it can be shown that
whereas E(MSPE) = $\sigma^2$ whether or not $H_0$ is true,
E(MSLF) = $\sigma^2$ if $H_0$ is true and E(MSLF) > $\sigma^2$ if $H_0$
is false.

The test statistic is F = MSLF/MSPE, and the corresponding
P-value is the area under the $F_{c-2,\,n-c}$ curve to
the right of f.

The following data comes from the article “Changes
in Growth Hormone Status Related to Body Weight
of Growing Cattle” (Growth, 1977: 241-247), with
x = body weight and y = metabolic clearance rate/body
weight.

x | 110  110  110  230  230  230  360
y | 235  198  173  174  149  124  115
x | 360  360  360  505  505  505  505
y | 130  102   95  122  112   98   96

(So c = 4, $n_1 = n_2 = 3$, $n_3 = n_4 = 4$.)

a. Test $H_0$ versus $H_a$ at level .05 using the lack-of-fit test
just described.

b. Does a scatterplot of the data suggest that the relationship
between x and y is linear? How does this
compare with the result of part (a)? (A nonlinear
regression function was used in the article.)

13.2 Regression with Transformed Variables 


The necessity for an alternative to the linear model $Y = \beta_0 + \beta_1 x + \epsilon$ may be
suggested either by a theoretical argument or else by examining diagnostic plots from a
linear regression analysis. In either case, settling on a model whose parameters can
be easily estimated is desirable. An important class of such models is specified by
means of functions that are “intrinsically linear.”


DEFINITION

A function relating y to x is intrinsically linear if, by means of a transformation
on x and/or y, the function can be expressed as $y' = \beta_0 + \beta_1 x'$, where
x' = the transformed independent variable and y' = the transformed dependent variable.

Four of the most useful intrinsically linear functions are given in Table 13.1. In each 
case, the appropriate transformation is either a log transformation—either base 10 
or natural logarithm (base e)—or a reciprocal transformation. Representative graphs 
of the four functions appear in Figure 13.3. 

For an exponential function relationship, only y is transformed to achieve
linearity, whereas for a power function relationship, both x and y are transformed.
Because the variable x is in the exponent in an exponential relationship, y increases
(if $\beta > 0$) or decreases (if $\beta < 0$) much more rapidly as x increases than is the case
for the power function, though over a short interval of x values it can be difficult to
differentiate between the two functions. Examples of functions that are not intrinsically
linear are $y = \alpha + \gamma e^{\beta x}$ and $y = \alpha + \gamma x^{\beta}$.


Table 13.1 Useful Intrinsically Linear Functions*

Function                                       Transformation(s) to Linearize       Linear Form
a. Exponential: $y = \alpha e^{\beta x}$          $y' = \ln(y)$                        $y' = \ln(\alpha) + \beta x$
b. Power: $y = \alpha x^{\beta}$                  $y' = \log(y)$, $x' = \log(x)$       $y' = \log(\alpha) + \beta x'$
c. $y = \alpha + \beta \cdot \log(x)$             $x' = \log(x)$                       $y = \alpha + \beta x'$
d. Reciprocal: $y = \alpha + \beta \cdot (1/x)$   $x' = 1/x$                           $y = \alpha + \beta x'$

*When log(·) appears, either a base 10 or a base e logarithm can be used.


Figure 13.3 Graphs of the intrinsically linear functions given in Table 13.1


Intrinsically linear functions lead directly to probabilistic models that, though 
not linear in x as a function, have parameters whose values are easily estimated using 
ordinary least squares. 


DEFINITION A probabilistic model relating Y to x is intrinsically linear if, by means of a
transformation on Y and/or x, it can be reduced to a linear probabilistic model
$Y' = \beta_0 + \beta_1 x' + \epsilon'$.


The intrinsically linear probabilistic models that correspond to the four functions of
Table 13.1 are as follows:


a. $Y = \alpha e^{\beta x} \cdot \epsilon$, a multiplicative exponential model, from which
$\ln(Y) = Y' = \beta_0 + \beta_1 x' + \epsilon'$ with $x' = x$, $\beta_0 = \ln(\alpha)$, $\beta_1 = \beta$, and $\epsilon' = \ln(\epsilon)$.


b. $Y = \alpha x^{\beta} \cdot \epsilon$, a multiplicative power model, so that
$\log(Y) = Y' = \beta_0 + \beta_1 x' + \epsilon'$ with $x' = \log(x)$, $\beta_0 = \log(\alpha)$, $\beta_1 = \beta$, and $\epsilon' = \log(\epsilon)$.

c. $Y = \alpha + \beta \log(x) + \epsilon$, so that $x' = \log(x)$ immediately linearizes the model.

d. $Y = \alpha + \beta \cdot 1/x + \epsilon$, so that $x' = 1/x$ yields a linear model.


The additive exponential and power models, $Y = \alpha e^{\beta x} + \epsilon$ and $Y = \alpha x^{\beta} + \epsilon$, are
not intrinsically linear. Notice that both (a) and (b) require a transformation on Y
and, as a result, a transformation on the error variable $\epsilon$. In fact, if $\epsilon$ has a lognormal
distribution (see Chapter 4) with $E(\epsilon) = e^{\sigma^2/2}$ and $V(\epsilon) = \tau^2$ independent of x, then
the transformed models for both (a) and (b) will satisfy all the assumptions of Chapter 12
regarding the linear probabilistic model; this in turn implies that all inferences
for the parameters of the transformed model based on these assumptions will be
valid. If $\sigma^2$ is small, $\mu_{Y \cdot x} \approx \alpha e^{\beta x}$ in (a) or $\alpha x^{\beta}$ in (b).

The major advantage of an intrinsically linear model is that the parameters $\beta_0$
and $\beta_1$ of the transformed model can be immediately estimated using the principle
of least squares simply by substituting x' and y' into the estimating formulas:

$$\hat{\beta}_1 = \frac{\sum x_i' y_i' - \sum x_i' \sum y_i' / n}{\sum (x_i')^2 - \left(\sum x_i'\right)^2 / n}$$

$$\hat{\beta}_0 = \frac{\sum y_i' - \hat{\beta}_1 \sum x_i'}{n} = \bar{y}' - \hat{\beta}_1 \bar{x}'$$ (13.5)

Parameters of the original nonlinear model can then be estimated by transforming back
$\hat{\beta}_0$ and/or $\hat{\beta}_1$ if necessary. Once a prediction interval for y' when x' = x'* has been
calculated, reversing the transformation gives a PI for y itself. In cases (a) and (b), when
$\sigma^2$ is small, an approximate CI for $\mu_{Y \cdot x^*}$ results from taking antilogs of the limits in the
CI for $\beta_0 + \beta_1 x'^*$. (Strictly speaking, taking antilogs gives a CI for the median of the Y
distribution, i.e., for $\tilde{\mu}_{Y \cdot x^*}$. Because the lognormal distribution is positively skewed,
$\mu > \tilde{\mu}$; the two are approximately equal if $\sigma^2$ is close to 0.)


EXAMPLE 13.3 Taylor's equation for tool life y as a function of cutting time x states that $xy^c = k$ or,
equivalently, that $y = \alpha x^{\beta}$ (see the Wikipedia entry on Tool wear for more information).
The article “The Effect of Experimental Error on the Determination of
Optimum Metal Cutting Conditions” (J. of Engr. for Industry, 1967: 315-322)
observes that the relationship is not exact (deterministic) and that the parameters
$\alpha$ and $\beta$ must be estimated from data. Thus an appropriate model is the multiplicative
power model $Y = \alpha \cdot x^{\beta} \cdot \epsilon$, which the author fit to the accompanying data
consisting of 12 carbide tool life observations (Table 13.2). In addition to the x, y,
x', and y' values, the predicted transformed values ($\hat{y}'$) and the predicted values on
the original scale ($\hat{y}$, after transforming back) are given.


The summary statistics for fitting a straight line to the transformed data are $\sum x_i' = 74.41200$,
$\sum y_i' = 26.22601$, $\sum x_i'^2 = 461.75874$, $\sum y_i'^2 = 67.74609$, and $\sum x_i' y_i' = 160.84601$, so

$$\hat{\beta}_1 = \frac{160.84601 - (74.41200)(26.22601)/12}{461.75874 - (74.41200)^2/12} = -5.3996$$

$$\hat{\beta}_0 = \frac{26.22601 - (-5.3996)(74.41200)}{12} = 35.6684$$


Table 13.2 Data for Example 13.3

  i     x       y     x' = ln(x)   y' = ln(y)     ŷ'       ŷ = e^ŷ'

  1    600    2.35     6.39693       .85442    1.12754     3.0881
  2    600    2.65     6.39693       .97456    1.12754     3.0881
  3    600    3.00     6.39693      1.09861    1.12754     3.0881
  4    600    3.60     6.39693      1.28093    1.12754     3.0881
  5    500    6.40     6.21461      1.85630    2.11203     8.2650
  6    500    7.80     6.21461      2.05412    2.11203     8.2650
  7    500    9.80     6.21461      2.28238    2.11203     8.2650
  8    500   16.50     6.21461      2.80336    2.11203     8.2650
  9    400   21.50     5.99146      3.06805    3.31694    27.5760
 10    400   24.50     5.99146      3.19867    3.31694    27.5760
 11    400   26.00     5.99146      3.25810    3.31694    27.5760
 12    400   33.00     5.99146      3.49651    3.31694    27.5760


The estimated values of $\alpha$ and $\beta$, the parameters of the power function model,
are $\hat{\beta} = \hat{\beta}_1 = -5.3996$ and $\hat{\alpha} = e^{\hat{\beta}_0} = 3.094491530 \cdot 10^{15}$. Thus the estimated
regression function is $\hat{\mu}_{Y \cdot x} \approx 3.094491530 \cdot 10^{15} \cdot x^{-5.3996}$. To recapture Taylor's
(estimated) equation, set $y = 3.094491530 \cdot 10^{15} \cdot x^{-5.3996}$, whence $xy^{.185} = 740$.

Figure 13.4(a) gives a plot of the standardized residuals from the linear regression
using transformed variables (for which $r^2 = .922$); there is no apparent pattern
in the plot, though one standardized residual is a bit large, and the residuals look as
they should for a simple linear regression. Figure 13.4(b) pictures a plot of $\hat{y}$ versus
y, which indicates satisfactory predictions on the original scale.

To obtain a confidence interval for median tool life when cutting time is 500,
we transform x = 500 to x' = 6.21461. Then $\hat{\beta}_0 + \hat{\beta}_1 x' = 2.1120$, and a 95% CI for
$\beta_0 + \beta_1(6.21461)$ is (from Section 12.4) $2.1120 \pm (2.228)(.0824) = (1.928, 2.296)$.
The 95% CI for $\tilde{\mu}_{Y \cdot 500}$ is then obtained by taking antilogs: $(e^{1.928}, e^{2.296}) =
(6.876, 9.930)$. It is easily checked that for the transformed data $s^2 = \hat{\sigma}^2 \approx .081$.
Because this is quite small, (6.876, 9.930) is an approximate interval for $\mu_{Y \cdot 500}$.


Figure 13.4 (a) Standardized residuals versus x' from Example 13.3; (b) $\hat{y}$ versus y from
Example 13.3
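
The calculations of Example 13.3 can be retraced in a few lines. The Python sketch below, assuming NumPy, applies the formulas in (13.5) to the log-transformed tool-life data, transforms back to estimate $\alpha$, and forms the 95% interval for median tool life at x = 500. The t critical value 2.228 (10 df) is the one used in the example and is entered by hand rather than computed; the variable names are illustrative.

```python
import numpy as np

# Tool-life data from Table 13.2
x = np.array([600] * 4 + [500] * 4 + [400] * 4, dtype=float)
y = np.array([2.35, 2.65, 3.00, 3.60, 6.40, 7.80, 9.80, 16.50,
              21.50, 24.50, 26.00, 33.00])

xp, yp = np.log(x), np.log(y)            # x' = ln(x), y' = ln(y)
n = len(xp)

# Estimating formulas (13.5) applied to the transformed data
sxx = np.sum(xp ** 2) - np.sum(xp) ** 2 / n
sxy = np.sum(xp * yp) - np.sum(xp) * np.sum(yp) / n
b1 = sxy / sxx                           # estimate of beta (the exponent)
b0 = (np.sum(yp) - b1 * np.sum(xp)) / n
alpha_hat = np.exp(b0)                   # back-transformed estimate of alpha

# 95% CI for beta0 + beta1 * ln(500), then antilogs for the median of Y
s = np.sqrt(np.sum((yp - (b0 + b1 * xp)) ** 2) / (n - 2))
x_star = np.log(500.0)
fit = b0 + b1 * x_star
se_fit = s * np.sqrt(1 / n + (x_star - xp.mean()) ** 2 / sxx)
t_crit = 2.228                           # from a t table, 10 df (value used in the example)
ci_log = (fit - t_crit * se_fit, fit + t_crit * se_fit)
ci_median = (np.exp(ci_log[0]), np.exp(ci_log[1]))

print("beta1 =", b1, " beta0 =", b0, " alpha =", alpha_hat)
print("95% CI for median tool life at x = 500:", ci_median)
```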


EXAMPLE 13.4 The accompanying data on x = length of a scamp (mm) and y = mercury content
(mg/kg) was extracted from a graph in the article “Mercury in Groupers and Sea
Basses from the Gulf of Mexico: Relationships with Size, Age, and Feeding
Ecology” (Transactions of the Amer. Fisheries Soc., 2012: 1274-1286).

x |  285   297   334   362   407   438   455   486   512
y | .076  .045  .064  .080  .061  .071  .131  .198  .084


x |  520   535   560   573   590   598   612   667   690
y | .247  .223  .223  .278  .368  .497  .281  .257  .577


Figure 13.5 displays a scatterplot of the data and a plot of the standardized residuals 
from a linear regression of y on x, both from Minitab. In the latter plot, most points 
on the far left and right are above the zero line whereas most points in the middle 
are below the line. This indication of curvature is more apparent in the residual plot 
than in the scatterplot. 

Figure 13.5 (a) Scatterplot; (b) residual plot from a linear regression for the data in
Example 13.4


Figure 13.6 shows a scatterplot of ln(y) versus x and a plot of the standardized
residuals resulting from a linear regression of ln(y) on x. There is no discernible
pattern in the latter plot other than pure randomness, and a normal probability plot
of the standardized residuals (not shown) has a very substantial linear pattern. The
$r^2$ value for this transformed regression is reasonably impressive, and the P-value
for the model utility test is .000. In fact, the cited article actually contained a plot of
ln(y) versus x; the included regression equation is very close to ours, with $r^2 = .792$
for the full sample of 49 observations.

Figure 13.6 (a) Scatterplot; (b) Residual plot for the transformed data of Example 13.4
(fitted line shown on the scatterplot: ln(merc) = -4.686 + 0.005746 length)

The estimated untransformed exponential regression function is
$\hat{y} = e^{-4.686 + .005746x} = .009224e^{.005746x}$, from which a point prediction for a future Y
corresponding to any particular x can be obtained. A prediction interval for the mercury
content of a single fish having length x* is obtained by first using the simple linear
regression to obtain a prediction interval for ln(Y) and then taking the antilog of the
two endpoints. For example, a 95% PI when x = 500 is approximately
$(e^{-2.600}, e^{-1.025}) = (.074, .359)$.
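
The following Python sketch, assuming NumPy, retraces Example 13.4: it regresses ln(y) on x for the 18 scamp observations and back-transforms a 95% prediction interval at x = 500. The t critical value 2.120 (16 df) is taken from a t table and entered by hand; the variable names are illustrative.

```python
import numpy as np

# Scamp data from Example 13.4: length (mm) and mercury content (mg/kg)
x = np.array([285, 297, 334, 362, 407, 438, 455, 486, 512,
              520, 535, 560, 573, 590, 598, 612, 667, 690], dtype=float)
y = np.array([.076, .045, .064, .080, .061, .071, .131, .198, .084,
              .247, .223, .223, .278, .368, .497, .281, .257, .577])

ly = np.log(y)                                   # transformed response ln(y)
n = len(x)
sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (ly - ly.mean())) / sxx
b0 = ly.mean() - b1 * x.mean()
s = np.sqrt(np.sum((ly - (b0 + b1 * x)) ** 2) / (n - 2))

# Prediction interval for ln(Y) at x* = 500, then antilogs for Y itself
x_star = 500.0
center = b0 + b1 * x_star
t_crit = 2.120                                   # from a t table, 16 df
half = t_crit * s * np.sqrt(1 + 1 / n + (x_star - x.mean()) ** 2 / sxx)
pi_y = (np.exp(center - half), np.exp(center + half))

print("ln(merc) = %.3f + %.6f length" % (b0, b1))
print("95%% PI for mercury content at length 500: (%.3f, %.3f)" % pi_y)
```

Note that the back-transformed interval is not symmetric about the point prediction $e^{\hat{\beta}_0 + \hat{\beta}_1 x^*}$, a consequence of exponentiating a symmetric interval.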


In analyzing transformed data, one should keep in mind the following points: 


1. Estimating B, and By as in (13.5) and then transforming back to obtain estimates 
of the original parameters is not equivalent to using the principle of least squares 
directly on the original model. Thus, for the exponential model, we could esti- 
mate a and B by minimizing =(y, — ae"). Iterative computation would be nec- 
essary. In general, @ # eo and B # B.. 

2. If the chosen model is not intrinsically linear, the approach summarized in (13.5) 
cannot be used. Instead, least squares (or some other fitting procedure) would have 
to be applied to the untransformed model. Thus, for the additive exponential model 
Y = aeP* + e, least squares would involve minimizing =(y,; — ae®*)*. Taking par- 
tial derivatives with respect to a and B results in two nonlinear normal equations in 
a and B; these equations must then be solved using an iterative procedure. 


3. When the transformed linear model satisfies all the assumptions described in
Chapter 12, the method of least squares yields best estimates of the transformed
parameters. However, estimates of the original parameters may not be best
in any sense, though they will be reasonable. For example, in the exponential
model, the estimator $\hat{\alpha} = e^{\hat{\beta}_0}$ will not be unbiased, though it will be the
maximum likelihood estimator of $\alpha$ if the error variable $\epsilon'$ is normally distributed.
Using least squares directly (without transforming) could yield better estimates.


4. If a transformation on y has been made and one wishes to use the standard
formulas to test hypotheses or construct CIs, $\epsilon'$ should be at least approximately
normally distributed. To check this, a normal probability plot of the standardized
residuals from the transformed regression should be examined.


5. When y is transformed, the $r^2$ value from the resulting regression refers to variation
in the $y_i'$'s explained by the transformed regression model. Although a high
value of $r^2$ here indicates a good fit of the estimated original nonlinear model to
the observed $y_i$'s, $r^2$ does not refer to these original observations. Perhaps the best
way to assess the quality of the fit is to compute the predicted values $\hat{y}_i'$ using the
transformed model, transform them back to the original y scale to obtain $\hat{y}_i$, and
then plot $\hat{y}$ versus y. A good fit is then evidenced by points close to the 45° line.
One could compute SSE = $\sum (y_i - \hat{y}_i)^2$ as a numerical measure of the goodness of
fit. When the model was linear, we compared this to SST = $\sum (y_i - \bar{y})^2$, the total
variation about the horizontal line at height $\bar{y}$; this led to $r^2$. In the nonlinear case,
though, it is not necessarily informative to measure total variation in this way, so
an $r^2$ value is not as useful as in the linear case.
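
The contrast between fitting the transformed model and applying least squares directly
to the original model can be seen numerically. The following Python sketch is not from
the text: the data values are invented, and the function names and the Gauss-Newton
routine with step halving are illustrative assumptions (any iterative minimizer could be
used instead). It fits the exponential model both ways and compares SSE on the original
y scale.

```python
import math

# Hypothetical (x, y) data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.9, 3.4, 5.3, 6.1, 9.6, 11.5, 17.4, 20.9]

def fit_line(u, v):
    """Ordinary least squares intercept and slope for v = b0 + b1*u."""
    n = len(u)
    ubar, vbar = sum(u) / n, sum(v) / n
    suv = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    suu = sum((ui - ubar) ** 2 for ui in u)
    b1 = suv / suu
    return vbar - b1 * ubar, b1

def sse(a, b):
    """SSE on the original y scale for the fitted curve a*exp(b*x)."""
    return sum((yi - a * math.exp(b * xi)) ** 2 for xi, yi in zip(x, y))

def direct_least_squares(a, b, max_iter=200):
    """Gauss-Newton iterations with step halving, so SSE never increases."""
    for _ in range(max_iter):
        e = [math.exp(b * xi) for xi in x]
        r = [yi - a * ei for yi, ei in zip(y, e)]        # current residuals
        ja = e                                           # d(fit)/d(a)
        jb = [a * xi * ei for xi, ei in zip(x, e)]       # d(fit)/d(b)
        saa = sum(p * p for p in ja)
        sab = sum(p * q for p, q in zip(ja, jb))
        sbb = sum(q * q for q in jb)
        ga = sum(p * ri for p, ri in zip(ja, r))
        gb = sum(q * ri for q, ri in zip(jb, r))
        det = saa * sbb - sab * sab
        if det == 0:
            break
        da = (sbb * ga - sab * gb) / det
        db = (saa * gb - sab * ga) / det
        step, current = 1.0, sse(a, b)
        while step > 1e-10 and sse(a + step * da, b + step * db) > current:
            step /= 2
        if step <= 1e-10:
            break
        a, b = a + step * da, b + step * db
        if abs(step * da) + abs(step * db) < 1e-12:
            break
    return a, b

# Transformed fit: regress ln(y) on x, then alpha-hat = e^(b0), beta-hat = b1.
b0, b1 = fit_line(x, [math.log(yi) for yi in y])
a_trans, b_trans = math.exp(b0), b1

# Direct fit on the original scale, started from the transformed estimates.
a_dir, b_dir = direct_least_squares(a_trans, b_trans)

sse_trans, sse_dir = sse(a_trans, b_trans), sse(a_dir, b_dir)
print(f"transformed: alpha={a_trans:.4f} beta={b_trans:.4f} SSE={sse_trans:.4f}")
print(f"direct:      alpha={a_dir:.4f} beta={b_dir:.4f} SSE={sse_dir:.4f}")
```

Because the direct fit starts from the back-transformed estimates and accepts only steps
that do not increase SSE, its SSE on the original scale can be no larger than that of the
transformed fit; the two sets of estimates generally differ.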


More General Regression Methods 


Thus far we have assumed that either $Y = f(x) + \epsilon$ (an additive model) or that
$Y = f(x) \cdot \epsilon$ (a multiplicative model). In the case of an additive model, $\mu_{Y \cdot x} = f(x)$,
so estimating the regression function f(x) amounts to estimating the curve of
mean y values. On occasion, a scatterplot of the data suggests that there is no simple
mathematical expression for f(x). Statisticians have recently developed some
more flexible methods that permit a wide variety of patterns to be modeled using
the same fitting procedure. One such method is LOWESS (or LOESS), short for
locally weighted scatterplot smoother. Let (x*, y*) denote a particular one of the
n (x, y) pairs in the sample. The $\hat{y}$ value corresponding to (x*, y*) is obtained by
fitting a straight line using only a specified percentage of the data (e.g., 25%)
whose x values are closest to x*. Furthermore, rather than use “ordinary” least
squares, which gives equal weight to all points, those with x values closer to x*
are more heavily weighted than those whose x values are farther away. The height
of the resulting line above x* is the fitted value $\hat{y}^*$. This process is repeated for
each of the n points, so n different lines are fit (you surely wouldn’t want to do
all this by hand). Finally, the fitted points are connected to produce a LOWESS
curve.
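
The procedure just described can be sketched in a few lines of Python. This is an
illustrative sketch, not the algorithm used by any particular package: the function name
`lowess_fit`, the tricube weight function, and the demonstration data are assumptions
made here, and the robustness iterations that full LOWESS implementations typically add
are omitted.

```python
def lowess_fit(x, y, span=0.25):
    """Fitted values from locally weighted straight-line fits.

    For each x*, take the fraction `span` of the points whose x values are
    closest to x*, weight them by a tricube function of scaled distance
    (an assumed choice of weights), fit a weighted least squares line, and
    record the height of that line above x*.
    """
    n = len(x)
    k = min(n, max(3, int(round(span * n))))   # number of points in each local fit
    fitted = []
    for x_star in x:
        nearest = sorted(range(n), key=lambda i: abs(x[i] - x_star))[:k]
        d_max = max(abs(x[i] - x_star) for i in nearest)
        if d_max == 0:
            # every neighbor has the same x value; fall back to their mean
            fitted.append(sum(y[i] for i in nearest) / k)
            continue
        w = [(1 - (abs(x[i] - x_star) / d_max) ** 3) ** 3 for i in nearest]
        sw = sum(w)
        xw = sum(wi * x[i] for wi, i in zip(w, nearest)) / sw
        yw = sum(wi * y[i] for wi, i in zip(w, nearest)) / sw
        sxx = sum(wi * (x[i] - xw) ** 2 for wi, i in zip(w, nearest))
        sxy = sum(wi * (x[i] - xw) * (y[i] - yw) for wi, i in zip(w, nearest))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(yw + slope * (x_star - xw))
    return fitted

# Hypothetical data with a change of slope near x = 38 (invented for illustration).
xs = [float(v) for v in range(20, 56, 2)]
ys = [3 * v if v <= 38 else 114 + 9 * (v - 38) for v in xs]
smooth = lowess_fit(xs, ys, span=0.5)
for xv, fv in zip(xs, smooth):
    print(f"x = {xv:4.0f}   fitted = {fv:7.2f}")
```

A larger span uses more of the data in each local fit and so produces a smoother curve;
a smaller span follows local features more closely.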


EXAMPLE 13.5 Weighing large deceased animals found in wilderness areas is usually not feasible, 
so it is desirable to have a method for estimating weight from various characteristics 
of an animal that can be easily determined. Minitab has a stored data set consisting 
of various characteristics for a sample of n = 143 wild bears. Figure 13.7(a) displays 
a scatterplot of y = weight versus x = distance around the chest (chest girth). At 
first glance, it looks as though a single line obtained from ordinary least squares 
would effectively summarize the pattern. Figure 13.7(b) shows the LOWESS curve 
produced by Minitab using a span of 50% [the fit at (x*, y*) is determined by the 
closest 50% of the sample]. The curve appears to consist of two straight line seg- 
ments joined together above approximately x = 38. The steeper line is to the right 
of 38, indicating that weight tends to increase more rapidly as girth does for girths 
exceeding 38 in. 
Figure 13.7 (a) A Minitab scatterplot for the bear weight data; (b) a Minitab LOWESS curve for the bear weight data


It is complicated to make other inferences (e.g., obtain a CI for a mean y value) 
based on this general type of regression model. The bootstrap technique mentioned 
earlier can be used for this purpose. 


Logistic Regression 


The simple linear regression model is appropriate for relating a quantitative response var- 
iable to a quantitative predictor x. Consider now a dichotomous response variable with 
possible values 1 and 0 corresponding to success and failure. Let p = P(S) = P(Y = 1). 
Frequently, the value of p will depend on the value of some quantitative variable x. For 
example, the probability that a car needs warranty service of a certain kind might well 
depend on the car’s mileage, or the probability of avoiding an infection of a certain type 
might depend on the dosage in an inoculation. Instead of using just the symbol p for the 
success probability, we now use p(x) to emphasize the dependence of this probability 
on the value of x. The simple linear regression equation $Y = \beta_0 + \beta_1 x + \epsilon$ is no longer
appropriate, for taking the mean value on each side of the equation gives

$$\mu_{Y \cdot x} = 1 \cdot p(x) + 0 \cdot (1 - p(x)) = p(x) = \beta_0 + \beta_1 x$$

Whereas p(x) is a probability and therefore must be between 0 and 1, $\beta_0 + \beta_1 x$ need
not be in this range.

Instead of letting the mean value of Y be a linear function of x, we now consider
a model in which some function of the mean value of Y is a linear function of x. In
other words, we allow p(x) to be a function of $\beta_0 + \beta_1 x$ rather than $\beta_0 + \beta_1 x$ itself.
A function that has been found quite useful in many applications is the logit function

$$p(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

Figure 13.8 shows a graph of p(x) for particular values of $\beta_0$ and $\beta_1$ with $\beta_1 > 0$. As
x increases, the probability of success increases. For $\beta_1$ negative, the success
probability would be a decreasing function of x.
Figure 13.8 A graph of a logit function


Logistic regression means assuming that p(x) is related to x by the logit function.
Straightforward algebra shows that

$$\frac{p(x)}{1 - p(x)} = e^{\beta_0 + \beta_1 x}$$

The expression on the left-hand side is called the odds. If, for example, $p(60)/[1 - p(60)] = 3$,
then when x = 60 a success is three times as likely as a failure. We now see that the
logarithm of the odds is a linear function of the predictor. In particular, the slope
parameter $\beta_1$ is the change in the log odds associated with a one-unit increase in x.
This implies that the odds itself changes by the multiplicative factor $e^{\beta_1}$ when x
increases by 1 unit.
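
The algebra is short enough to write out. The steps below assume only that the logit
function has the standard form $p(x) = e^{\beta_0+\beta_1 x}/(1+e^{\beta_0+\beta_1 x})$; they are
a worked sketch rather than a reproduction of any derivation in the text.

```latex
\begin{align*}
1 - p(x) &= 1 - \frac{e^{\beta_0+\beta_1 x}}{1+e^{\beta_0+\beta_1 x}}
          = \frac{1}{1+e^{\beta_0+\beta_1 x}} \\[4pt]
\frac{p(x)}{1-p(x)} &= e^{\beta_0+\beta_1 x}
  \quad\Longrightarrow\quad
  \ln\!\left[\frac{p(x)}{1-p(x)}\right] = \beta_0+\beta_1 x \\[4pt]
\frac{p(x+1)/[1-p(x+1)]}{p(x)/[1-p(x)]}
  &= \frac{e^{\beta_0+\beta_1 (x+1)}}{e^{\beta_0+\beta_1 x}} = e^{\beta_1}
\end{align*}
```

The last line shows that the ratio of odds at x + 1 to odds at x does not depend on x,
which is why a single factor $e^{\beta_1}$ summarizes the effect of the predictor.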

Fitting the logistic regression to sample data requires that the parameters $\beta_0$
and $\beta_1$ be estimated. This is usually done using the maximum likelihood technique
described in Chapter 6. The details are quite involved, but fortunately the most popu- 
lar statistical computer packages will do this on request and provide quantitative and 
pictorial indications of how well the model fits. 
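
For readers who want to see what the software maximizes, the likelihood can be written
down directly. This is a sketch under the assumption that the n responses $y_i$ (each
0 or 1) are independent with success probabilities $p(x_i)$ of the standard logit form;
the symbol $\ell$ for the log-likelihood is notation introduced here, not the book's.

```latex
\begin{align*}
L(\beta_0,\beta_1) &= \prod_{i=1}^{n} p(x_i)^{y_i}\,[1-p(x_i)]^{1-y_i} \\[4pt]
\ell(\beta_0,\beta_1) &= \ln L
  = \sum_{i=1}^{n}\Big[\,y_i(\beta_0+\beta_1 x_i) - \ln\!\big(1+e^{\beta_0+\beta_1 x_i}\big)\Big] \\[4pt]
\frac{\partial \ell}{\partial \beta_0} &= \sum_{i=1}^{n}\,[\,y_i - p(x_i)\,] = 0,
\qquad
\frac{\partial \ell}{\partial \beta_1} = \sum_{i=1}^{n}\,x_i\,[\,y_i - p(x_i)\,] = 0
\end{align*}
```

The two likelihood equations are nonlinear in $\beta_0$ and $\beta_1$, so they have no
closed-form solution and must be solved iteratively.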


EXAMPLE 13.6 Here is data, in the form of a comparative stem-and-leaf display, on launch tempera- 
ture and the incidence of failure of O-rings in 23 space shuttle launches prior to the 
Challenger disaster of 1986 (Y = yes, failed; N = no, did not fail). Observations on 
the left side of the display tend to be smaller than those on the right side. 


      Y |   | N
    873 | 5 |
      3 | 6 | 677789        Stem: Tens digit
    500 | 7 | 002356689     Leaf: Ones digit
        | 8 | 1


Figure 13.9 shows Minitab output for a logistic regression analysis and a graph of
the estimated logit function from the R software. We have chosen to let p denote the
probability of failure. The graph of $\hat{p}$ decreases as temperature increases because
failures tended to occur at lower temperatures than did successes. The estimate of $\beta_1$
and its estimated standard deviation are $\hat{\beta}_1 = -.232$ and $s_{\hat{\beta}_1} = .1082$, respectively.
We assume that the sample size n is large enough here so that $\hat{\beta}_1$ has approximately a
normal distribution. If $\beta_1 = 0$ (i.e., temperature does not affect the likelihood of O-ring
failure), the test statistic $Z = \hat{\beta}_1 / S_{\hat{\beta}_1}$ has approximately a standard normal distribution.
The reported value of this ratio is z = -2.14, with a corresponding two-tailed P-value of
.032 (some packages report a chi-square value which is just $z^2$, with the same P-value).
At significance level .05, we reject the null hypothesis of no temperature effect.


Binary Logistic Regression: failure versus temp

Logistic Regression Table
                                                   Odds      95% CI
Predictor        Coef    SE Coef      Z      P    Ratio   Lower  Upper
Constant      15.0429    7.37862   2.04  0.041
temp        -0.232163   0.108236  -2.14  0.032     0.79    0.64   0.98

Goodness-of-Fit Tests
Method            Chi-Square  DF      P
Pearson              11.1303  14  0.676
Deviance             11.9974  14  0.607

Hosmer -Lemeshow Oe PALES 8 0.286 


Figure 13.9 Logistic regression output from Minitab for Example 13.6, and graph of estimated
logistic function and classification probabilities from R


The estimated odds of failure for any particular temperature value x is

$$\frac{\hat{p}(x)}{1 - \hat{p}(x)} = e^{15.0429 - .232163x}$$

This implies that the odds ratio (the odds of failure at a temperature of x + 1
divided by the odds of failure at a temperature of x) is

$$\frac{\hat{p}(x + 1)/[1 - \hat{p}(x + 1)]}{\hat{p}(x)/[1 - \hat{p}(x)]} = e^{-.232163} = .7928$$

The interpretation is that for each additional degree of temperature, we estimate that the
odds of failure will be multiplied by a factor of .79 (a decrease of about 21%). A 95% CI
for the true odds ratio also appears on output. In addition, Minitab provides three different
ways of assessing model lack-of-fit: the Pearson, deviance, and Hosmer-Lemeshow tests.
Large P-values are consistent with a good model. These tests are useful in multiple logistic
regression, where there is more than one predictor in the model relationship so there is no
single graph like that of Figure 13.9(b). Various diagnostic plots are also available.
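
The z ratio, P-value, odds ratio, and confidence limits reported in the output can be
reproduced from the slope estimate and its standard error. The sketch below assumes that
the interval for the odds ratio is obtained by exponentiating the endpoints of a
large-sample interval for the slope with critical value 1.96; that assumption is not
stated in the text, though it agrees with the printed limits. Variable names are
illustrative.

```python
import math

# Values read from the Minitab output for Example 13.6
coef = -0.232163      # estimated slope for temp
se = 0.108236         # its estimated standard deviation

def std_normal_cdf(z):
    """Standard normal cdf computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = coef / se
p_value = 2.0 * std_normal_cdf(-abs(z))       # two-tailed P-value

odds_ratio = math.exp(coef)                   # change in odds per one-degree increase
z_crit = 1.96                                 # assumed critical value for 95% confidence
ci = (math.exp(coef - z_crit * se), math.exp(coef + z_crit * se))

print(f"z = {z:.2f}, two-tailed P-value = {p_value:.3f}")
print(f"odds ratio = {odds_ratio:.4f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the interval for the odds ratio lies entirely below 1, it leads to the same
conclusion as the test: higher temperature is associated with lower odds of failure.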
The R output provides information based on classifying an observation as
a failure if the estimated $\hat{p}(x)$ is at least .5 and as a non-failure otherwise. Since
$\hat{p}(x) = .5$ when x = 64.80, three of the seven failures (Ys in the graph) would
be misclassified as non-failures (a misclassification proportion of .429), whereas
none of the non-failure observations would be misclassified. A better way to assess
the likelihood of misclassification is to use cross-validation: Remove the first
observation from the sample, estimate the relationship, then classify the first observation
based on this estimated relationship, and repeat this process with each of the other
sample observations (so a sample observation does not affect its own classification).

The launch temperature for the Challenger mission was only 31°F. This
temperature is much smaller than any value in the sample, so it is dangerous to
extrapolate the estimated relationship. Nevertheless, it appears that O-ring failure is
virtually a sure thing for a temperature this small.
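
The fit and the cross-validation idea can be sketched in Python using the temperatures
read from the stem-and-leaf display. This is an illustration only: the Newton-Raphson
routine, the centering of x inside it, and all function names are choices made here, not
the method used by Minitab or R.

```python
import math

# Launch temperatures read from the stem-and-leaf display of Example 13.6
fail_temps = [53, 57, 58, 63, 70, 70, 75]
ok_temps = [66, 67, 67, 67, 68, 69, 70, 70, 72, 73, 75, 76, 76, 78, 79, 81]
temps = fail_temps + ok_temps
failed = [1] * len(fail_temps) + [0] * len(ok_temps)

def fit_logistic(x, y, max_iter=100, tol=1e-10):
    """Maximum likelihood estimates (b0, b1) by Newton-Raphson."""
    n = len(x)
    xbar = sum(x) / n
    xc = [xi - xbar for xi in x]          # centering improves numerical stability
    a, b = 0.0, 0.0                       # intercept (centered scale) and slope
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in xc]
        g0 = sum(yi - pi for yi, pi in zip(y, p))                 # score for intercept
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, xc))  # score for slope
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, xc))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, xc))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a - b * xbar, b                # convert intercept back to the original scale

def p_hat(b0, b1, x):
    """Estimated probability of failure at temperature x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def classify(b0, b1, x):
    """Classify as failure (1) when the estimated probability is at least .5."""
    return 1 if p_hat(b0, b1, x) >= 0.5 else 0

b0, b1 = fit_logistic(temps, failed)
in_sample_errors = sum(classify(b0, b1, t) != f for t, f in zip(temps, failed))

# Leave-one-out cross-validation: refit without observation i, then classify it.
loo_errors = 0
for i in range(len(temps)):
    c0, c1 = fit_logistic(temps[:i] + temps[i + 1:], failed[:i] + failed[i + 1:])
    loo_errors += classify(c0, c1, temps[i]) != failed[i]

print(f"b0 = {b0:.4f}, b1 = {b1:.6f}")
print(f"in-sample misclassifications: {in_sample_errors} of {len(temps)}")
print(f"leave-one-out misclassifications: {loo_errors} of {len(temps)}")
print(f"estimated P(failure) at 31 degrees: {p_hat(b0, b1, 31):.4f}")
```

The last line evaluates the fitted curve at 31°F only to illustrate the extrapolation
discussed above; the caution about extrapolating far outside the sampled temperatures
still applies.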


EXERCISES Section 13.2 (15-25) 


15. No tortilla chip aficionado likes soggy chips, so it is important to find
characteristics of the production process that produce chips with an appealing texture.
The following data on x = frying time (sec) and y = moisture content (%)
appeared in the article “Thermal and Physical Properties
of Tortilla Chips as a Function of Frying Time” (J. of
Food Processing and Preservation, 1995: 175-189).

x | 5 10 15 20 25 30 45 60
y | 168 ae Bel ee ks OE.

a. Construct a scatterplot of y versus x and comment.
b. Construct a scatterplot of the (ln(x), ln(y)) pairs and comment.
c. What probabilistic relationship between x and y is suggested by the linear
pattern in the plot of part (b)?
d. Predict the value of moisture content when frying time is 20, in a way that
conveys information about reliability and precision.
e. Analyze the residuals from fitting the simple linear regression model to the
transformed data and comment.

16. Polyester fiber ropes are increasingly being used as components of mooring lines
for offshore structures in deep water. The authors of the paper “Quantifying the
Residual Creep Life of Polyester Mooring Ropes” (Intl. J. of Offshore and Polar
Exploration, 2005: 223-228) used the accompanying data as a basis for studying how
time to failure (hr) depended on load (% of breaking load):

x | ITT 77.8 717.9 77.8 85.5 85.5
y | 5.067 552.056 127.809 7.611 .124 .077

x | 89.2 89.3 7A 85.5 89.2 85.5
y | .008 013 49.439 503 .362 9.930

x | 89.2 85.5 89.2 82.3 82.0 82.3
y | 677 5.322 .289 53.079 7.625 155.299

A linear regression of log(time) versus load was fit. The
investigators were particularly interested in estimating
the slope of the true regression line relating these variables.
Investigate the quality of the fit, estimate the slope,
and predict time to failure when load is 80, in a way that
conveys information about reliability and precision.

17. The following data on mass rate of burning x and flame length y is representative
of that which appeared in the article “Some Burning Characteristics of Filter Paper”
(Combustion Science and Technology, 1971).

x | 1.7 2.2 2.3 2.6 2.7 3.0 3.2
y | 1.3 1.8 1.6 2.0 2.1 2.2 3.0

x | 3.3 4.1 4.3 4.6 5.7 6.1
y | 2.6 4.1 3.7 5.0 5.8 5.3

a. Estimate the parameters of a power function model.
b. Construct diagnostic plots to check whether a power function is an appropriate
model choice.
c. Test $H_0$: $\beta = 4/3$ versus $H_a$: $\beta < 4/3$, using a level .05 test.
d. Test the null hypothesis that states that the median flame length when burning
rate is 5.0 is twice the median flame length when burning rate is 2.5 against
the alternative that this is not the case.

18. Failures in aircraft gas turbine engines due to high cycle fatigue is a pervasive
problem. The article “Effect of Crystal Orientation on Fatigue Failure
of Single Crystal Nickel Base Turbine Blade Superalloys” (J. of Engineering for Gas
Turbines and Power, 2002: 161-176) gave the accompanying data and fit a nonlinear
regression model in order to predict strain amplitude from cycles to failure. Fit an
appropriate model, investigate the quality of the fit, and
predict amplitude when cycles to failure = 5000.
19. Thermal endurance tests were performed to study the
relationship between temperature and lifetime of polyester
enameled wire (“Thermal Endurance of
Polyester Enameled Wires Using Twisted Wire
Specimens,” IEEE Trans. Insulation, 1965: 38-44),
resulting in the following data.

Temp.    |  200   200   200   200   200   200
Lifetime | 5933  5404  4947  4963  3358  3878

Temp.    |  220   220   220   220   220   220
Lifetime | 1561  1494   747   768   609   777

Temp.    |  240   240   240   240   240   240
Lifetime |  258   299   209   144   180   184


a. Does a scatterplot of the data suggest a linear probabi- 
listic relationship between lifetime and temperature? 

b. What model is implied by a linear relationship 
between expected In(lifetime) and 1/temperature? 
Does a scatterplot of the transformed data appear 
consistent with this relationship? 

c. Estimate the parameters of the model suggested in 
part (b). What lifetime would you predict for a tem- 
perature of 220? 

d. Because there are multiple observations at each 
x value, the method in Exercise 14 can be used to test 
the null hypothesis that states that the model sugges- 
ted in part (b) is correct. Carry out the test at level .01. 


20. Exercise 14 presented data on body weight x and metabolic
clearance rate/body weight y. Consider the following
intrinsically linear functions for specifying the relationship
between the two variables: (a) ln(y) versus x, (b) ln(y) versus
ln(x), (c) y versus ln(x), (d) y versus 1/x, and (e) ln(y)
versus 1/x. Use any appropriate diagnostic plots and analyses
to decide which of these functions you would select to
specify a probabilistic model. Explain your reasoning.


21. Mineral mining is one of the most important economic
activities in Chile. Mineral products are frequently found
in saline systems composed largely of natural nitrates.
Freshwater is often used as a leaching agent for the
extraction of nitrate, but the Chilean mining regions have
scarce freshwater resources. An alternative leaching agent
is seawater. The authors of “Recovery of Nitrates from
Leaching Solutions Using Seawater” (Hydrometallurgy,
2013: 100-105) evaluated the recovery of nitrate ions
from discarded salts using freshwater and seawater leaching
agents. Here is data on x = leaching time (h) and y =
nitrate extraction percentage (seawater):

x |  25.5   31.5   37.5   43.5   49.5   55.5
y |  26.4   40.1   50.2   57.4   62.7   67.3

x |  61.5   67.5   73.5   79.5   85.5   91.5
y |  71.4   74.7   77.8   80.3   82.3   84.1

x |  97.5  103.5  109.5  115.5  121.5  127.5
y |  85.5   86.6   87.9   89.0   89.9   90.6

x | 133.5  139.5  145.5  151.5  157.5
y |  91.2   91.8   92.3   92.8   93.3

a. Construct a scatterplot. If the simple linear regression model were fit to this
data, what would a plot of the (x, e*) pairs look like?

b. Construct a scatterplot of y versus x′ = 1/x and
speculate on the value of $r^2$ after fitting the simple
linear regression model to the transformed data.

c. Obtain a 95% prediction interval for nitrate extraction
percentage when leaching time = 100 h.


22. In each of the following cases, decide whether the given
function is intrinsically linear. If so, identify x′ and y′,
and then explain how a random error term $\epsilon$ can be introduced
to yield an intrinsically linear probabilistic model.
a. $y = 1/(\alpha + \beta x)$

b. $y = 1/(1 + e^{\alpha + \beta x})$

c. y = e*' (a Gompertz curve) 

d. y=a+t Be 

23. Suppose x and y are related according to a probabilistic
exponential model $Y = \alpha e^{\beta x} \cdot \epsilon$, with $V(\epsilon)$ a constant
independent of x (as was the case in the simple linear
model $Y = \beta_0 + \beta_1 x + \epsilon$). Is V(Y) a constant independent
of x [as was the case for $Y = \beta_0 + \beta_1 x + \epsilon$, where
$V(Y) = \sigma^2$]? Explain your reasoning. Draw a picture of a
prototype scatterplot resulting from this model. Answer
the same questions for the power model $Y = \alpha x^{\beta} \cdot \epsilon$.


24. Kyphosis refers to severe forward flexion of the spine
following corrective spinal surgery. A study carried out
to determine risk factors for kyphosis reported the
accompanying ages (months) for 40 subjects at the time
of the operation; the first 18 subjects did have kyphosis
and the remaining 22 did not.

Kyphosis      12   15   42   52   59   73
              82   91   96  105  114  120
             121  128  130  139  139  157

No kyphosis    1    1    2    8   11   18
              22   31   37   61   72   81
              97  112  118  127  131  140
             151  159  177  206

Use the Minitab logistic regression output on the next
page to decide whether age appears to have a significant
impact on the presence of kyphosis.
Logistic regression table for Exercise 24

                                                            95% CI
Predictor       Coef      StDev      Z      P  Odds Ratio  Lower  Upper
Constant     -0.5727     0.6024  -0.95  0.342
age         0.004296   0.005849   0.73  0.463        1.00   0.99   1.02

25. The article “Acceptable Noise Levels for Construction
Site Offices” (Building Serv. Engr. Res. Tech., 2009:
87-94) analyzed responses from a sample of 77 individuals,
each of whom was asked to say whether a particular noise level (dBA)
to which he/she had been exposed was acceptable or unacceptable. Here is data
provided by the article’s authors:
Acceptable: 
55.3 55.3 55.3 55.9 55.9 55.9 55.9 56.1 56.1 56.1 56.1 
56.1 56.1 56.8 56.8 57.0 57.0 57.0 57.8 57.8 57.8 57.9 
57.9 57.9 58.8 58.8 58.8 59.8 59.8 59.8 62.2 62.2 65.3 
65.3 65.3 65.3 68.7 69.0 73.0 73.0 
Unacceptable: 
63.8 63.8 63.8 63.9 63.9 63.9 64.7 64.7 64.7 65.1 65.1 
65.1 67.4 67.4 67.4 67.4 68.7 68.7 68.7 70.4 70.4 71.2 
71.2 73.1 73.1 74.6 74.6 74.6 74.6 79.3 79.3 79.3 79.3 
79.53. 83.0 83:0 83:00 
Interpret the accompanying Minitab logistic regression output, and sketch a graph of the 
estimated probability of a noise level being acceptable as a function of the level. 
Logistic regression table for Exercise 25

                                                               95% CI
Predictor          Coef     SE Coef      Z      P  Odds Ratio  Lower  Upper
Constant        23.2124     5.05095   4.60  0.000
noise level   -0.359441   0.0785031  -4.58  0.000        0.70   0.60   0.81


13.3. Polynomial Regression 


The nonlinear yet intrinsically linear models of Section 13.2 involved functions of the
independent variable x that were either strictly increasing or strictly decreasing. In many
situations, either theoretical reasoning or else a scatterplot of the data suggests that the true
regression function $\mu_{Y \cdot x}$ has one or more peaks or valleys, that is, at least one relative
minimum or maximum. In such cases, a polynomial function $y = \beta_0 + \beta_1 x + \cdots + \beta_k x^k$
may provide a satisfactory approximation to the true regression function.


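DEFINITION The kth-degree polynomial regression model equation is

$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + \epsilon \qquad (13.6)$$

where $\epsilon$ is a normally distributed random variable with

$$\mu_{\epsilon} = 0 \qquad \sigma_{\epsilon}^2 = \sigma^2 \qquad (13.7)$$


From (13.6) and (13.7), it follows immediately that

$$\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \cdots + \beta_k x^k \qquad \sigma_{Y \cdot x}^2 = \sigma^2 \qquad (13.8)$$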


In words, the expected value of Y is a kth-degree polynomial function of x, whereas
the variance of Y, which controls the spread of observed values about the regression
function, is the same for each value of x. The observed pairs $(x_1, y_1), \ldots, (x_n, y_n)$ are
assumed to have been generated independently from the model (13.6). Figure 13.10
illustrates both a quadratic and cubic model; very rarely in practice is it necessary
to go beyond k = 3.

Figure 13.10 (a) Quadratic regression model; (b) cubic regression model


Estimating Parameters 


To estimate the $\beta$'s, consider a trial regression function $y = b_0 + b_1 x + \cdots + b_k x^k$.
Then the goodness of fit of this function to the observed data can be assessed by
computing the sum of squared deviations

$$f(b_0, b_1, \ldots, b_k) = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i + b_2 x_i^2 + \cdots + b_k x_i^k)]^2 \qquad (13.9)$$

According to the principle of least squares, the estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ are those
values of $b_0, b_1, \ldots, b_k$ that minimize Expression (13.9). It should be noted that when
$x_1, x_2, \ldots, x_n$ are all different, there is a polynomial of degree n - 1 that fits the data
perfectly, so that the minimizing value of (13.9) is 0 when k = n - 1. However, in
virtually all applications, the polynomial model (13.6) with large k is quite unrealistic.
To find the minimizing values in (13.9), take the k + 1 partial derivatives
$\partial f/\partial b_0, \partial f/\partial b_1, \ldots, \partial f/\partial b_k$ and equate them to 0. This gives a system of normal
equations for the estimates. Because the trial function $b_0 + b_1 x + \cdots + b_k x^k$ is linear in
$b_0, \ldots, b_k$ (though not in x), the k + 1 normal equations are linear in these unknowns:

$$\begin{aligned}
b_0 n + b_1 \sum x_i + b_2 \sum x_i^2 + \cdots + b_k \sum x_i^k &= \sum y_i \\
b_0 \sum x_i + b_1 \sum x_i^2 + b_2 \sum x_i^3 + \cdots + b_k \sum x_i^{k+1} &= \sum x_i y_i \\
&\;\;\vdots \\
b_0 \sum x_i^k + b_1 \sum x_i^{k+1} + \cdots + b_k \sum x_i^{2k} &= \sum x_i^k y_i
\end{aligned} \qquad (13.10)$$

All standard statistical computer packages will automatically solve the equations in
(13.10) and provide the estimates as well as much other information.*


EXAMPLE 13.7 The article “Residual Stresses and Adhesion of Thermal Spray Coatings” (Surface
Engineering, 2005: 35-40) considered the relationship between the thickness (μm)
of NiCrAl coatings deposited on stainless steel substrate and corresponding bond
strength (MPa). The following data was read from a plot in the paper:

Thickness |  220   220   220   220   370   370   370   370   440   440
Strength  | 24.0  22.0  19.1  15.5  26.3  24.6  23.1  21.2  25.2  24.0

Thickness |  440   440   680   680   680   680   860   860   860   860
Strength  | 21.7  19.2  17.0  14.9  13.0  11.8  12.2  11.2   6.6   2.8


* We will see in Section 13.4 that polynomial regression is a special case of multiple regression, so a 
command appropriate for this latter task is generally used. 
The scatterplot in Figure 13.11(a) supports the choice of the quadratic regression
model. Figure 13.11(b) contains Minitab output from a fit of this model. The
estimated regression coefficients are

$$\hat{\beta}_0 = 14.521 \qquad \hat{\beta}_1 = .04323 \qquad \hat{\beta}_2 = -.00006001$$

from which the estimated regression function is

$$\hat{y} = 14.521 + .04323x - .00006001x^2$$

Substitution of the successive x values 220, 220, ..., 860, and 860 into this
function gives the predicted values $\hat{y}_1 = 21.128, \ldots, \hat{y}_{20} = 7.321$, and the residuals
$y_1 - \hat{y}_1 = 2.872, \ldots, y_{20} - \hat{y}_{20} = -4.521$ result from subtraction. Figure 13.12
shows a plot of the standardized residuals versus $\hat{y}$ and also a normal probability plot
of the standardized residuals, both of which validate the quadratic model.


(a) Scatterplot of strength versus thickness

(b) Minitab output:

The regression equation is
strength = 14.5 + 0.0432 thickness - 0.000060 thicksqd

Predictor           Coef      SE Coef      T      P
Constant          14.521        4.754   3.05  0.007
thickness        0.04323      0.01981   2.18  0.043
thicksqd     -0.00006001   0.00001786  -3.36  0.004

S = 3.26937   R-Sq = 78.0%   R-Sq(adj) = 75.4%

Analysis of Variance
Source           DF      SS      MS      F      P
Regression        2  643.29  321.65  30.09  0.000
Residual Error   17  181.71   10.69
Total            19  825.00

Predicted Values for New Observations
New Obs     Fit  SE Fit        95% CI             95% PI
      1  21.136   1.167  (18.674, 23.598)  (13.812, 28.460)
      2  10.704   1.189  ( 8.195, 13.212)  ( 3.364, 18.043)

Values of Predictors for New Observations
New Obs  thickness  thicksqd
      1        500    250000
      2        800    640000

Figure 13.11 Scatterplot of data from Example 13.7 and Minitab output from fit of
quadratic model
(Panels: Normal Probability Plot of the Residuals; Residuals Versus the Fitted Values)

Figure 13.12 Diagnostic plots for quadratic model fit to data of Example 13.7
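
The estimates in Example 13.7 can be reproduced by forming and solving the normal
equations (13.10) with k = 2. The Python sketch below is illustrative only: the helper
names are invented here, the small Gaussian-elimination routine stands in for whatever a
statistical package actually does, and the x values are centered at their mean before the
equations are formed (a numerical precaution taken up later in this section under
Centering x Values), after which the coefficients are converted back to the original scale.

```python
# Data of Example 13.7: coating thickness and bond strength
thickness = [220] * 4 + [370] * 4 + [440] * 4 + [680] * 4 + [860] * 4
strength = [24.0, 22.0, 19.1, 15.5, 26.3, 24.6, 23.1, 21.2, 25.2, 24.0,
            21.7, 19.2, 17.0, 14.9, 13.0, 11.8, 12.2, 11.2, 6.6, 2.8]

def normal_equations(x, y, k):
    """Coefficient matrix and right-hand side of the k+1 normal equations."""
    a = [[sum(xi ** (r + c) for xi in x) for c in range(k + 1)] for r in range(k + 1)]
    b = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(k + 1)]
    return a, b

def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]       # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):                       # back substitution
        s = sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (m[r][n] - s) / m[r][r]
    return sol

xbar = sum(thickness) / len(thickness)
centered = [t - xbar for t in thickness]
a, b = normal_equations(centered, strength, 2)
c0, c1, c2 = solve_linear_system(a, b)                   # coefficients on the centered scale

# Expand c0 + c1(x - xbar) + c2(x - xbar)^2 to get the coefficients of 1, x, x^2.
beta2 = c2
beta1 = c1 - 2 * c2 * xbar
beta0 = c0 - c1 * xbar + c2 * xbar ** 2

fitted = [beta0 + beta1 * t + beta2 * t ** 2 for t in thickness]
sse = sum((yi - fi) ** 2 for yi, fi in zip(strength, fitted))
ybar = sum(strength) / len(strength)
sst = sum((yi - ybar) ** 2 for yi in strength)

print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.5f}, beta2 = {beta2:.8f}")
print(f"SSE = {sse:.2f}, SST = {sst:.2f}")
```

The printed coefficients and sums of squares can be compared with the Coef column and the
ANOVA table of the Minitab output in Figure 13.11(b).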


$\hat{\sigma}^2$ and $R^2$

To make further inferences, the error variance $\sigma^2$ must be estimated. With
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \cdots + \hat{\beta}_k x_i^k$, the ith residual is $y_i - \hat{y}_i$, and the sum of squared
residuals (error sum of squares) is SSE = $\sum (y_i - \hat{y}_i)^2$. The estimate of $\sigma^2$ is then

$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - (k + 1)} = MSE \qquad (13.11)$$

where the denominator n - (k + 1) is used because k + 1 df are lost in estimating
$\beta_0, \beta_1, \ldots, \beta_k$.

If we again let SST = $\sum (y_i - \bar{y})^2$, then SSE/SST is the proportion of the total
variation in the observed $y_i$'s that is not explained by the polynomial model. The
quantity 1 - SSE/SST, the proportion of variation explained by the model, is called
the coefficient of multiple determination and is denoted by $R^2$.

Consider fitting a cubic model to the data in Example 13.7. Because this
model includes the quadratic as a special case, the fit will be at least as good as
the fit to a quadratic. More generally, with $SSE_k$ = the error sum of squares from a
kth-degree polynomial, $SSE_{k'} \le SSE_k$ and $R^2_{k'} \ge R^2_k$ whenever k' > k. Because the
objective of regression analysis is to find a model that is both simple (relatively few
parameters) and provides a good fit to the data, a higher-degree polynomial may not
specify a better model than a lower-degree model despite its higher $R^2$ value. To
balance the cost of using more parameters against the gain in $R^2$, many statisticians
use the adjusted coefficient of multiple determination

$$\text{adjusted } R^2 = 1 - \frac{n - 1}{n - (k + 1)} \cdot \frac{SSE}{SST} = \frac{(n - 1)R^2 - k}{n - 1 - k} \qquad (13.12)$$

Adjusted $R^2$ adjusts the proportion of unexplained variation upward [since the ratio
(n - 1)/(n - k - 1) exceeds 1], which results in adjusted $R^2 < R^2$. For example, if
$R^2_2 = .66$, $R^2_3 = .70$, and n = 10, then

$$\text{adjusted } R^2_2 = \frac{9(.66) - 2}{10 - 3} = .563 \qquad \text{adjusted } R^2_3 = \frac{9(.70) - 3}{10 - 4} = .550$$

Thus the small gain in $R^2$ in going from a quadratic to a cubic model is not enough
to offset the cost of adding an extra parameter to the model.


EXAMPLE 13.8 (Example 13.7 continued) SSE and SST are typically found on computer output
in an ANOVA table. Figure 13.11(b) gives SSE = 181.71 and SST = 825.00 for the bond
strength data, from which $R^2$ = 1 - 181.71/825.00 = .780 (alternatively, $R^2$ = SSR/SST =
643.29/825.00 = .780). Thus 78.0% of the observed variation in bond strength can
be attributed to the model relationship. Adjusted $R^2$ = .754, only a small downward
change in $R^2$. The estimates of $\sigma^2$ and $\sigma$ are

$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - (k + 1)} = \frac{181.71}{20 - (2 + 1)} = 10.69 \qquad \hat{\sigma} = s = 3.27$$
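
The arithmetic behind these summary quantities is easy to check. In the Python sketch
below, the function names are illustrative; the inputs are the SSE, SST, n, and k values
quoted in the text, together with the $R^2$ values .66 and .70 from the earlier numerical
illustration.

```python
import math

def fit_summary(sse, sst, n, k):
    """R^2, adjusted R^2, s^2, and s for a kth-degree polynomial fit to n points."""
    r2 = 1 - sse / sst
    adj_r2 = 1 - (n - 1) / (n - (k + 1)) * sse / sst
    s2 = sse / (n - (k + 1))
    return r2, adj_r2, s2, math.sqrt(s2)

def adjusted_from_r2(r2, n, k):
    """Second form of (13.12), written in terms of R^2 itself."""
    return ((n - 1) * r2 - k) / (n - 1 - k)

# Bond strength data: quadratic fit (k = 2) to n = 20 observations
r2, adj_r2, s2, s = fit_summary(sse=181.71, sst=825.00, n=20, k=2)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}, s^2 = {s2:.2f}, s = {s:.2f}")

# Quadratic versus cubic comparison with n = 10
adj_quadratic = adjusted_from_r2(0.66, n=10, k=2)
adj_cubic = adjusted_from_r2(0.70, n=10, k=3)
print(f"adjusted R^2: quadratic = {adj_quadratic:.3f}, cubic = {adj_cubic:.3f}")
```

The two forms of the adjusted coefficient agree algebraically, so either can be used
depending on whether the sums of squares or only $R^2$ is available.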


Besides computing R? and adjusted R?, one should examine the usual diagnostic 
plots to determine whether model assumptions are valid or whether modification may 
be appropriate (see Figure 13.12). There is also a formal test of model utility, an F test 
based on the ANOVA sums of squares. Since polynomial regression is a special case 
of multiple regression, we defer discussion of this test to the next section. 


Statistical Intervals and Test Procedures 


Because the $y_i$'s appear in the normal equations (13.10) only on the right-hand side
and in a linear fashion, the resulting estimates $\hat{\beta}_0, \ldots, \hat{\beta}_k$ are themselves linear
functions of the $y_i$'s. Thus the estimators are linear functions of the $Y_i$'s, so each $\hat{\beta}_i$ has a
normal distribution. It can also be shown that each $\hat{\beta}_i$ is an unbiased estimator of $\beta_i$.

Let $\sigma_{\hat{\beta}_i}$ denote the standard deviation of the estimator $\hat{\beta}_i$. This standard deviation
has the form

$$\sigma_{\hat{\beta}_i} = \sigma \cdot \{\text{a complicated expression involving all } x_j\text{'s}, x_j^2\text{'s}, \ldots, \text{and } x_j^k\text{'s}\}$$

Fortunately, the expression in braces has been programmed into all of the most frequently
used statistical software packages. The estimated standard deviation of $\hat{\beta}_i$
results from substituting s in place of $\sigma$ in the expression for $\sigma_{\hat{\beta}_i}$. These estimated
standard deviations $s_{\hat{\beta}_0}, s_{\hat{\beta}_1}, \ldots$, and $s_{\hat{\beta}_k}$ appear in output from all the aforementioned
statistical packages. Let $S_{\hat{\beta}_i}$ be the estimator of $\sigma_{\hat{\beta}_i}$, that is, the random variable
whose observed value is $s_{\hat{\beta}_i}$. Then it can be shown that the standardized variable

$$T = \frac{\hat{\beta}_i - \beta_i}{S_{\hat{\beta}_i}} \qquad (13.13)$$

has a t distribution based on n - (k + 1) df. This leads to the following inferential
procedures.


A 100(1 - $\alpha$)% CI for $\beta_i$, the coefficient of $x^i$ in the polynomial regression
function, is

$$\hat{\beta}_i \pm t_{\alpha/2, n-(k+1)} \cdot s_{\hat{\beta}_i}$$

A test of $H_0$: $\beta_i = \beta_{i0}$ is based on the t statistic value

$$t = \frac{\hat{\beta}_i - \beta_{i0}}{s_{\hat{\beta}_i}}$$

The test is based on n - (k + 1) df and is upper-, lower-, or two-tailed according
to whether the inequality in $H_a$ is >, <, or ≠.


A point estimate of $\mu_{Y \cdot x}$, that is, of $\beta_0 + \beta_1 x + \cdots + \beta_k x^k$, is
$\hat{\mu}_{Y \cdot x} = \hat{\beta}_0 + \hat{\beta}_1 x + \cdots + \hat{\beta}_k x^k$. The estimated standard deviation of the
corresponding estimator is rather complicated. Many computer packages will give this
estimated standard deviation for any x value upon request. This, along with an appropriate
standardized t variable, can be used to justify the following procedures.


Let x* denote a specified value of x. A 100(1 - $\alpha$)% CI for $\mu_{Y \cdot x^*}$ is

$$\hat{\mu}_{Y \cdot x^*} \pm t_{\alpha/2, n-(k+1)} \cdot \{\text{estimated SD of } \hat{\mu}_{Y \cdot x^*}\}$$

With $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x^* + \cdots + \hat{\beta}_k (x^*)^k$, $\hat{y}$ denoting the calculated value of $\hat{Y}$ for
the given data, and $s_{\hat{Y}}$ denoting the estimated standard deviation of the
statistic $\hat{Y}$, the formula for the CI is much like the one in the case of simple
linear regression:

$$\hat{y} \pm t_{\alpha/2, n-(k+1)} \cdot s_{\hat{Y}}$$

A 100(1 - $\alpha$)% PI for a future y value to be observed when x = x* is

$$\hat{\mu}_{Y \cdot x^*} \pm t_{\alpha/2, n-(k+1)} \cdot \{s^2 + (\text{estimated SD of } \hat{\mu}_{Y \cdot x^*})^2\}^{1/2} = \hat{y} \pm t_{\alpha/2, n-(k+1)} \cdot \sqrt{s^2 + s_{\hat{Y}}^2}$$


EXAMPLE 13.9 (Example 13.8 continued)

Figure 13.11(b) shows that $\hat\beta_2 = -.00006001$ and $s_{\hat\beta_2} = .00001786$ (from the SE
Coef column at the top of the output). The null hypothesis $H_0\colon \beta_2 = 0$ says that
as long as the linear predictor x is retained in the model, the quadratic predictor $x^2$
provides no additional useful information. The relevant alternative is $H_a\colon \beta_2 \ne 0$,
and the test statistic is $T = \hat\beta_2 / S_{\hat\beta_2}$, with computed value -3.36. The test is based
on n - (k + 1) = 17 df. At significance level .05, the null hypothesis is rejected
because the reported P-value is .004 (double the area under the $t_{17}$ curve to the left of
-3.36). Thus inclusion of the quadratic predictor in the model equation is justified.

The output in Figure 13.11(b) also contains estimation and prediction information
both for x = 500 and for x = 800. In particular, for x = 500,

$$\hat{y} = \hat\beta_0 + \hat\beta_1(500) + \hat\beta_2(500)^2 = \text{Fit} = 21.136$$

$$s_{\hat{Y}} = \text{estimated SD of } \hat{Y} = \text{SE Fit} = 1.167$$

from which a 95% CI for mean strength when thickness = 500 is 21.136 ± (2.110)(1.167)
= (18.67, 23.60). A 95% PI for the strength resulting from a single bond when
thickness = 500 is 21.136 ± (2.110)[(3.27)^2 + (1.167)^2]^{1/2} = (13.81, 28.46). As before,
the PI is substantially wider than the CI because s is large compared to SE Fit. ■
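
The arithmetic in Example 13.9 can be retraced with a short Python sketch. The function names are illustrative only (they are not part of any statistical package), and the critical value 2.110 is simply the value quoted in the example rather than something the code derives.

```python
import math

def t_statistic(estimate, null_value, se):
    """Standardized t statistic: (estimate - null value) / estimated SD."""
    return (estimate - null_value) / se

def confidence_interval(fit, se_fit, t_crit):
    """CI for the mean response: fit +/- t_crit * se_fit."""
    half_width = t_crit * se_fit
    return fit - half_width, fit + half_width

def prediction_interval(fit, se_fit, s, t_crit):
    """PI for a single future y: fit +/- t_crit * sqrt(s^2 + se_fit^2)."""
    half_width = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
    return fit - half_width, fit + half_width

# Values quoted in Example 13.9 (t_crit is the 17-df critical value used there).
t_crit = 2.110
t_value = t_statistic(-0.00006001, 0.0, 0.00001786)
ci = confidence_interval(21.136, 1.167, t_crit)
pi = prediction_interval(21.136, 1.167, 3.27, t_crit)

print(round(t_value, 2))
print(tuple(round(v, 2) for v in ci))
print(tuple(round(v, 2) for v in pi))
```

The only difference between the two intervals is the extra $s^2$ under the square root, which is why the PI is so much wider when s is large relative to SE Fit.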


Centering x Values 


For the quadratic model with regression function $\mu_{Y\cdot x} = \beta_0 + \beta_1 x + \beta_2 x^2$, the parameters
$\beta_0$, $\beta_1$, and $\beta_2$ characterize the behavior of the function near x = 0. For example,
$\beta_0$ is the height at which the regression function crosses the vertical axis x = 0,
whereas $\beta_1$ is the first derivative of the function at x = 0 (instantaneous rate of change
of $\mu_{Y\cdot x}$ at x = 0). If the $x_i$'s all lie far from 0, we may not have precise information
about the values of these parameters. Let $\bar{x}$ = the average of the $x_i$'s for which observations
are to be taken, and consider the model

$$Y = \beta_0^* + \beta_1^*(x - \bar{x}) + \beta_2^*(x - \bar{x})^2 + \epsilon \tag{13.14}$$

In the model (13.14), $\mu_{Y\cdot x} = \beta_0^* + \beta_1^*(x - \bar{x}) + \beta_2^*(x - \bar{x})^2$, and the parameters now
describe the behavior of the regression function near the center $\bar{x}$ of the data.

To estimate the parameters of (13.14), we simply subtract $\bar{x}$ from each $x_i$ to
obtain $x_i' = x_i - \bar{x}$ and then use the $x_i'$'s in place of the $x_i$'s. An important benefit
of this is that the coefficients of $b_0, \ldots, b_k$ in the normal equations (13.10) will be of
much smaller magnitude than would be the case were the original $x_i$'s used. When
the system is solved by computer, this centering protects against any round-off error
that may result.
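
To see how much centering shrinks the numbers involved, the following Python sketch compares the power sums $\sum x_i^p$ (p = 0, ..., 4) that enter the normal equations of a quadratic fit, before and after centering. The x values are hypothetical, chosen only because they lie far from 0, and `power_sums` is an illustrative helper name.

```python
def power_sums(xs, max_power):
    """Return [sum(x**p) for p = 0..max_power]; for a polynomial fit these
    sums are the coefficients of the b's in the normal equations."""
    return [sum(x ** p for x in xs) for p in range(max_power + 1)]

# Hypothetical x values lying far from 0 (for illustration only).
xs = [280.0, 290.0, 300.0, 310.0, 320.0]
x_bar = sum(xs) / len(xs)
centered = [x - x_bar for x in xs]

raw_sums = power_sums(xs, 4)             # a quadratic fit needs powers 0..4
centered_sums = power_sums(centered, 4)

print(raw_sums)
print(centered_sums)
```

With these values the largest raw sum is on the order of $10^{10}$, whereas the largest centered sum is 340,000, so the centered system mixes numbers of far more comparable size.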


EXAMPLE 13.10 The article “A Method for Improving the Accuracy of Polynomial Regression
Analysis” (J. of Quality Tech., 1971: 149-155) reports the following data on x =
cure temperature (°F) and y = ultimate shear strength of a rubber compound (psi),
with $\bar{x} = 297.13$:

x  |  280     284     292     295     298     305     308     315
x' | -17.13  -13.13   -5.13   -2.13     .87    7.87   10.87   17.87
y  |  770     800     840     810     735     640     590     560

A computer analysis yielded the results shown in Table 13.3.

Table 13.3 Estimated Coefficients and Standard Deviations for Example 13.10

Parameter    Estimate      Estimated SD    Parameter      Estimate    Estimated SD
$\beta_0$    -26,219.64    11,912.78       $\beta_0^*$    759.36      23.20
$\beta_1$    189.21        80.25           $\beta_1^*$    -7.61       1.43
$\beta_2$    -.3312        .1350           $\beta_2^*$    -.3312      .1350

The estimated regression function using the original model is $\hat{y} = -26{,}219.64 + 189.21x - .3312x^2$,
whereas for the centered model the function is $\hat{y} = 759.36 - 7.61(x - 297.13) - .3312(x - 297.13)^2$.
These estimated functions are identical; the
only difference is that different parameters have been estimated for the two models. The
estimated standard deviations indicate clearly that $\beta_0^*$ and $\beta_1^*$ have been more accurately
estimated than $\beta_0$ and $\beta_1$. The quadratic parameters are identical ($\beta_2 = \beta_2^*$), as can be
seen by comparing the $x^2$ term in (13.14) with the original model. We emphasize again
that a major benefit of centering is the gain in computational accuracy, not only in quadratic
but also in higher-degree models. ■
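
The claim that the two estimated functions in Example 13.10 are identical can be checked by expanding the centered quadratic algebraically. The Python sketch below does this with the rounded estimates from Table 13.3; `uncenter_quadratic` is an illustrative name, not a library routine.

```python
def uncenter_quadratic(b0_star, b1_star, b2_star, x_bar):
    """Expand b0* + b1*(x - x_bar) + b2*(x - x_bar)**2 into
    b0 + b1*x + b2*x**2 and return (b0, b1, b2)."""
    b2 = b2_star
    b1 = b1_star - 2.0 * b2_star * x_bar
    b0 = b0_star - b1_star * x_bar + b2_star * x_bar ** 2
    return b0, b1, b2

# Centered estimates and x_bar as reported (rounded) in Example 13.10.
b0, b1, b2 = uncenter_quadratic(759.36, -7.61, -0.3312, 297.13)
print(round(b0, 2), round(b1, 2), b2)
```

The recovered linear coefficient agrees with 189.21, and the intercept agrees with -26,219.64 to within a few tenths. Because the inputs are rounded to only a few significant digits, a small discrepancy in the large intercept is expected.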


The book by Neter et al., listed in the chapter bibliography, is a good source 
for more information about polynomial regression. 


EXERCISES Section 13.3 (26-35)


26. The article “Physical Properties of Cumin Seed” (J. of
Agric. Engr. Res., 1996: 93-98) considered a quadratic
regression of y = bulk density on x = moisture content.
Data from a graph in the article follows, along with
Minitab output from the quadratic fit.

The regression equation is
bulkdens = 403 + 16.2 moiscont - 0.706 contsqd

Predictor      Coef     StDev      T       P
Constant     403.24     36.45   11.06   0.002
moiscont     16.164     5.451    2.97   0.059
contsqd     -0.7063    0.1852   -3.81   0.032

S = 10.15   R-Sq = 93.8%   R-Sq(adj) = 89.6%

Analysis of Variance

Source           DF       SS       MS       F       P
Regression        2   4637.7   2318.9   22.51   0.016
Residual Error    3    309.1    103.0
Total             5   4946.8

                                     StDev                St
Obs   moiscont   bulkdens      Fit     Fit   Residual   Resid
1          7.0     479.00   481.78    9.35      -2.78   -0.70
2         10.3     503.00   494.79    5.78       8.20    0.98
3         13.7     487.00   492.12    6.49      -5.12   -0.66
4         16.6     470.00   476.93    6.10      -6.93   -0.85
5         19.8     458.00   446.39    5.69      11.61    1.38
6         22.0     412.00   416.99    8.75      -4.99   -0.97

          StDev
   Fit      Fit          95.0% CI              95.0% PI
491.10     6.52   (470.36, 511.83)    (452.71, 529.48)

a. Does a scatterplot of the data appear consistent with
the quadratic regression model?

b. What proportion of observed variation in density can
be attributed to the model relationship?

c. Calculate a 95% CI for true average density when
moisture content is 13.7.

d. The last line of output is from a request for estimation
and prediction information when moisture content
is 14. Calculate a 99% PI for density when moisture
content is 14.

e. Does the quadratic predictor appear to provide useful
information? Test the appropriate hypotheses at significance
level .05.

27. The following data on y = glucose concentration (g/L)
and x = fermentation time (days) for a particular blend
of malt liquor was read from a scatterplot in the article
“Improving Fermentation Productivity with Reverse
Osmosis” (Food Tech., 1984: 92-96):

x |  1   2   3   4   5   6   7   8
y | 74  54  52  51  52  53  58  71

a. Verify that a scatterplot of the data is consistent with
the choice of a quadratic regression model.

b. The estimated quadratic regression equation is
$\hat{y} = 84.482 - 15.875x + 1.7679x^2$. Predict the value
of glucose concentration for a fermentation time of
6 days, and compute the corresponding residual.

c. Using SSE = 61.77, what proportion of observed
variation can be attributed to the quadratic regression
relationship?

d. The n = 8 standardized residuals based on the quadratic
model are 1.91, -1.95, -.25, .58, .90, .04,
-.66, and .20. Construct a plot of the standardized
residuals versus x and a normal probability plot. Do
the plots exhibit any troublesome features?

e. The estimated standard deviation of $\hat\mu_{Y\cdot 6}$ (that is, of
$\hat\beta_0 + \hat\beta_1(6) + \hat\beta_2(36)$) is 1.69. Compute a 95% CI for
$\mu_{Y\cdot 6}$.

f. Compute a 95% PI for a glucose concentration
observation made after 6 days of fermentation time.


28. The viscosity (y) of an oil was measured by a cone and
plate viscometer at six different cone speeds (x). It was
assumed that a quadratic regression model was appropriate,
and the estimated regression function resulting from
the n = 6 observations was

$\hat{y} = -113.0937 + 3.3684x - .01780x^2$

a. Estimate $\mu_{Y\cdot 75}$, the expected viscosity when speed is
75 rpm.

b. What viscosity would you predict for a cone speed of
60 rpm?

c. If $\sum y_i^2 = 8386.43$, $\sum y_i = 210.70$, $\sum x_i y_i = 17{,}002.00$,
and $\sum x_i^2 y_i = 1{,}419{,}780$, compute SSE
[$= \sum y_i^2 - \hat\beta_0 \sum y_i - \hat\beta_1 \sum x_i y_i - \hat\beta_2 \sum x_i^2 y_i$] and s.

d. From part (c), SST = 8386.43 - (210.70)^2/6 = 987.35.
Using SSE computed in part (c), what is the computed
value of $R^2$?

e. If the estimated standard deviation of $\hat\beta_2$ is
$s_{\hat\beta_2} = .00226$, test $H_0\colon \beta_2 = 0$ versus $H_a\colon \beta_2 \ne 0$ at
level .01, and interpret the result.


29. High-alumina refractory castables have been extensively
investigated in recent years because of their significant
advantages over other refractory brick of the same
class: lower production and application costs, versatility,
and performance at high temperatures. The accompanying
data on x = viscosity (MPa · s) and y = free-flow (%)
was read from a graph in the article “Processing of Zero-
Cement Self-Flow Alumina Castables” (The Amer.
Ceramic Soc. Bull., 1998: 60-66):

x | 351  367  373  400  402  456  484
y |  81   83   79   75   70   43   22

The authors of the cited paper related these two variables
using a quadratic regression model. The estimated regression
function is $\hat{y} = -295.96 + 2.1885x - .0031662x^2$.

a. Compute the predicted values and residuals, and then
SSE and $s^2$.

b. Compute and interpret the coefficient of multiple
determination.

c. The estimated SD of $\hat\beta_2$ is $s_{\hat\beta_2} = .0004835$. Does the
quadratic predictor belong in the regression model?

d. The estimated SD of $\hat\beta_1$ is .4050. Use this and the
information in (c) to obtain joint CIs for the linear
and quadratic regression coefficients with a joint
confidence level of (at least) 95%.

e. The estimated SD of $\hat\mu_{Y\cdot 400}$ is 1.198. Calculate a 95%
CI for true average free-flow when viscosity = 400
and also a 95% PI for free-flow resulting from a
single observation made when viscosity = 400, and
compare the intervals.


30. The accompanying data was extracted from the article
“Effects of Cold and Warm Temperatures on
Springback of Aluminum-Magnesium Alloy 5083-
H111” (J. of Engr. Manuf., 2009: 427-431). The
response variable is yield strength (MPa), and the predictor
is temperature (°C).

x |  -30     25    100    200    300
y | 91.0  120.5  136.0  133.1  120.8

Here is Minitab output from fitting the quadratic regression
model (a graph in the cited paper suggests that the
authors did this):

Predictor         Coef     SE Coef       T       P
Constant       111.277       2.100   52.98   0.000
temp           0.32845     0.03303    9.94   0.010
tempsqd     -0.0010050   0.0001213   -8.29   0.014

S = 3.44398   R-Sq = 98.1%   R-Sq(adj) = 96.3%

Analysis of Variance

Source           DF        SS       MS       F       P
Regression        2   1245.39   622.69   52.50   0.019
Residual Error    2     23.72    11.86
Total             4   1269.11

a. What proportion of observed variation in strength
can be attributed to the model relationship?

b. Carry out a test of hypotheses at significance level .05
to decide if the quadratic predictor provides useful
information over and above that provided by the linear
predictor.

c. For a temperature value of 100, $\hat{y} = 134.07$ and $s_{\hat{Y}} = 2.38$.
Estimate true average strength when temperature is
100, in a way that conveys information about precision
and reliability.

d. Use the information in (c) to predict strength for a
single observation to be made when temperature is
100, and do so in a way that conveys information
about precision and reliability. Then compare this
prediction to the estimate obtained in (c).


31. The accompanying data on y = energy output (W) and
x = temperature difference (°K) was provided by the
authors of the article “Comparison of Energy and
Exergy Efficiency for Solar Box and Parabolic
Cookers” (J. of Energy Engr., 2007: 53-62).
The article’s authors fit a cubic regression model to the
data. Here is Minitab output from such a fit.

x | 23.20  23.50  23.52  24.30  25.10  26.20  27.40  28.10  29.30  30.60  31.50  32.01
y |  3.78   4.12   4.24   5.35   5.87   6.02   6.12   6.41   6.62   6.43   6.13   5.92

x | 32.63  33.23  33.62  34.18  35.43  35.62  36.16  36.23  36.89  37.90  39.10  41.66
y |  5.64   5.45   5.21   4.98   4.65   4.50   4.34   4.03   3.92   3.65   3.02   2.89

The regression equation is
y = -134 + 12.7 x - 0.377 x**2 + 0.00359 x**3

Predictor        Coef     SE Coef        T       P
Constant     -133.787       8.048   -16.62   0.000
x             12.7423      0.7750    16.44   0.000
x**2         -0.37652     0.02444   -15.41   0.000
x**3        0.0035861   0.0002529    14.18   0.000

S = 0.168354   R-Sq = 98.0%   R-Sq(adj) = 97.7%

Analysis of Variance

Source           DF        SS       MS        F       P
Regression        3   27.9744   9.3248   329.00   0.000
Residual Error   20    0.5669   0.0283
Total            23   28.5413

a. What proportion of observed variation in energy
output can be attributed to the model relationship?

b. Fitting a quadratic model to the data results in
$R^2 = .780$. Calculate adjusted $R^2$ for this model and
compare to adjusted $R^2$ for the cubic model.

c. Does the cubic predictor appear to provide useful
information about y over and above that provided by
the linear and quadratic predictors? State and test the
appropriate hypotheses.

d. When x = 30, $s_{\hat{Y}} = .0611$. Obtain a 95% CI for true
average energy output in this case, and also a 95% PI
for a single energy output to be observed when temperature
difference is 30. [Hint: $s_{\hat{Y}} = .0611$.]

e. Interpret the hypotheses $H_0\colon \mu_{Y\cdot 35} = 5$ versus
$H_a\colon \mu_{Y\cdot 35} \ne 5$, and then carry out a test at significance
level .05 using the fact that when x = 35, $s_{\hat{Y}} = .0523$.


32. The following data is a subset of data obtained in an
experiment to study the relationship between x = soil pH
and y = Al Concentration/EC (“Root Responses of
Three Gramineae Species to Soil Acidity in an Oxisol
and an Ultisol,” Soil Science, 1973: 295-302):

x | 4.01  4.07  4.08  4.10  4.18
y | 1.20   .78   .83   .98   .65

x | 4.20  4.23  4.27  4.30  4.41
y | (five values illegible in the source)

x | 4.45  4.50  4.58  4.68  4.70  4.77
y |  .20   .24   .10   .13   .07   .04

A cubic model was proposed in the article, but the version
of Minitab used by the author of the present text
refused to include the $x^3$ term in the model, stating that
“x**3 is highly correlated with other predictor variables.”
To remedy this, $\bar{x} = 4.3456$ was subtracted from each x
value to yield $x' = x - \bar{x}$. A cubic regression was then
requested to fit the model having regression function

$y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2 + \beta_3^* (x')^3$

The following computer output resulted:

Parameter      Estimate    Estimated SD
$\beta_0^*$      .3463          .0366
$\beta_1^*$    -1.2933          .2535
$\beta_2^*$     2.3964          .5699
$\beta_3^*$    -2.3968         2.4590

a. What is the estimated regression function for the “centered”
model?

b. What is the estimated value of the coefficient $\beta_3$ in
the “uncentered” model with regression function
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$? What is the estimate
of $\beta_2$?

c. Using the cubic model, what value of y would you
predict when soil pH is 4.5?

d. Carry out a test to decide whether the cubic term
should be retained in the model.


33. In many polynomial regression problems, rather than
fitting a “centered” regression function using $x' = x - \bar{x}$,
computational accuracy can be improved by using a
function of the standardized independent variable
$x' = (x - \bar{x})/s_x$, where $s_x$ is the standard deviation of the
$x_i$'s. Consider fitting the cubic regression function
$y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2 + \beta_3^* (x')^3$ to the following data
resulting from a study of the relation between thrust
efficiency y of supersonic propelling rockets and the
half-divergence angle x of the rocket nozzle (“More on
Correlating Data,” CHEMTECH, 1976: 266-270):

x |    5    10    15    20    25    30    35
y | .985  .996  .988  .962  .940  .915  .878

Parameter      Estimate    Estimated SD
$\beta_0^*$      .9671          .0026
$\beta_1^*$     -.0502          .0051
$\beta_2^*$     -.0176          .0023
$\beta_3^*$      .0062          .0031

a. What value of y would you predict when the half-
divergence angle is 20? When x = 25?

b. What is the estimated regression function
$\hat\beta_0 + \hat\beta_1 x + \hat\beta_2 x^2 + \hat\beta_3 x^3$ for the “unstandardized”
model?

c. Use a level .05 test to decide whether the cubic term
should be deleted from the model.

d. What can you say about the relationship between
SSEs and $R^2$'s for the standardized and unstandardized
models? Explain.

e. SSE for the cubic model is .00006300, whereas for a
quadratic model SSE is .00014367. Compute $R^2$ for
each model. Does the difference between the two
suggest that the cubic term can be deleted?


34. The following data resulted from an experiment to assess
the potential of unburnt colliery spoil as a medium for
plant growth. The variables are x = acid extractable cations
and y = exchangeable acidity/total cation exchange
capacity (“Exchangeable Acidity in Unburnt Colliery
Spoil,” Nature, 1969: 161):

y | .91  .78  .69  .52  .48  .55

Standardizing the independent variable x to obtain
$x' = (x - \bar{x})/s_x$ and fitting the regression function
$y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2$ yielded the accompanying
computer output.

Parameter      Estimate    Estimated SD
$\beta_0^*$      .8733          .0421
$\beta_1^*$     -.3255          .0316
$\beta_2^*$      .0448          .0319

a. Estimate $\mu_{Y\cdot 50}$.

b. Compute the value of the coefficient of multiple
determination. (See Exercise 28(c).)

c. What is the estimated regression function $\hat\beta_0 + \hat\beta_1 x + \hat\beta_2 x^2$
using the unstandardized variable x?

d. What is the estimated standard deviation of $\hat\beta_2$ computed
in part (c)?

e. Carry out a test using the standardized estimates to
decide whether the quadratic term should be retained
in the model. Repeat using the unstandardized estimates.
Do your conclusions differ?


35. The article “The Respiration in Air and in Water of
the Limpets Patella caerulea and Patella lusitanica”
(Comp. Biochemistry and Physiology, 1975: 407-411)
proposed a simple power model for the relationship
between respiration rate y and temperature x for P. caerulea
in air. However, a plot of ln(y) versus x exhibits a
curved pattern. Fit the quadratic power model
$Y = \alpha e^{\beta x + \gamma x^2} \cdot \epsilon$ to the accompanying data.

x |   10    15     20     25     30
y | 37.1  70.1  109.7  177.2  222.6


13.4 Multiple Regression Analysis


In multiple regression, the objective is to build a probabilistic model that relates
a dependent variable y to more than one independent or predictor variable. Let k
represent the number of predictor variables (k ≥ 2) and denote these predictors by
$x_1, x_2, \ldots, x_k$. For example, in attempting to predict the selling price of a house, we might
have k = 3 with $x_1$ = size (ft²), $x_2$ = age (years), and $x_3$ = number of rooms.


DEFINITION The general additive multiple regression model equation is

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon \tag{13.15}$$

where $E(\epsilon) = 0$ and $V(\epsilon) = \sigma^2$. In addition, for purposes of testing hypotheses
and calculating CIs or PIs, it is assumed that $\epsilon$ is normally distributed.


Let $x_1^*, x_2^*, \ldots, x_k^*$ be particular values of $x_1, \ldots, x_k$. Then (13.15) implies that

$$\mu_{Y\cdot x_1^*, \ldots, x_k^*} = \beta_0 + \beta_1 x_1^* + \cdots + \beta_k x_k^* \tag{13.16}$$

Thus just as $\beta_0 + \beta_1 x$ describes the mean Y value as a function of x in simple linear
regression, the true (or population) regression function $\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
gives the expected value of Y as a function of $x_1, \ldots, x_k$. The $\beta_i$'s are the true (or
population) regression coefficients. The regression coefficient $\beta_1$ is interpreted as
the expected change in Y associated with a 1-unit increase in $x_1$ while $x_2, \ldots, x_k$ are
held fixed. Analogous interpretations hold for $\beta_2, \ldots, \beta_k$.


Models with Interaction and Quadratic 
Predictors 


If an investigator has obtained observations on y, $x_1$, and $x_2$, one possible model is
$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$. However, other models can be constructed by forming
predictors that are mathematical functions of $x_1$ and/or $x_2$. For example, with $x_3 = x_1^2$
and $x_4 = x_1 x_2$, the model

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \epsilon$$

has the general form of (13.15). In general, it is not only permissible for some predictors
to be mathematical functions of others but also often highly desirable in the sense
that the resulting model may be much more successful in explaining variation in y
than any model without such predictors. This discussion also shows that polynomial
regression is indeed a special case of multiple regression. For example, the quadratic
model $Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon$ has the form of (13.15) with k = 2, $x_1 = x$,
and $x_2 = x^2$.
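
The idea that derived predictors simply become extra columns in a model of the form (13.15) can be sketched in Python as follows. The coefficient values are hypothetical, and both function names are illustrative.

```python
def derived_predictors(x1, x2):
    """Return (x1, x2, x3, x4) with x3 = x1**2 and x4 = x1*x2."""
    return (x1, x2, x1 ** 2, x1 * x2)

def mean_response(betas, predictors):
    """Evaluate beta0 + beta1*x1 + ... + betak*xk, i.e. the right side of
    (13.15) without the random error term."""
    b0, *rest = betas
    return b0 + sum(b * x for b, x in zip(rest, predictors))

# Hypothetical coefficients beta0..beta4, chosen only for illustration.
betas = (1.0, 0.5, -1.0, 0.25, 2.0)
value = mean_response(betas, derived_predictors(2.0, 3.0))
print(value)
```

Once $x_3$ and $x_4$ have been computed, `mean_response` treats them exactly like any other predictor. This is the sense in which the model remains linear in the coefficients even though it is not linear in $x_1$ and $x_2$.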

For the case of two independent variables, $x_1$ and $x_2$, consider the following
four derived models.


1. The first-order model:
   $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$
2. The second-order no-interaction model:
   $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \epsilon$
3. The model with first-order predictors and interaction:
   $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon$
4. The complete second-order or full quadratic model:
   $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \epsilon$

Understanding the differences among these models is an important first step in building 
realistic regression models from the independent variables under study. 

The first-order model is the most straightforward generalization of simple
linear regression. It states that for a fixed value of either variable, the expected
value of Y is a linear function of the other variable and that the expected change in Y
associated with a unit increase in $x_1$ ($x_2$) is $\beta_1$ ($\beta_2$) independent of the level of $x_2$ ($x_1$).
Thus if we graph the regression function as a function of $x_1$ for several different
values of $x_2$, we obtain as contours of the regression function a collection of parallel
lines, as pictured in Figure 13.13(a). The function $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ specifies
a plane in three-dimensional space; the first-order model says that each observed
value of the dependent variable corresponds to a point which deviates vertically
from this plane by a random amount $\epsilon$.

According to the second-order no-interaction model, if $x_2$ is fixed, the expected
change in Y for a 1-unit increase in $x_1$ is

$$\beta_0 + \beta_1(x_1 + 1) + \beta_2 x_2 + \beta_3(x_1 + 1)^2 + \beta_4 x_2^2 - (\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2) = \beta_1 + \beta_3 + 2\beta_3 x_1$$

Because this expected change does not depend on $x_2$, the contours of the regression
function for different values of $x_2$ are still parallel to one another. However, the
dependence of the expected change on the value of $x_1$ means that the contours are
now curves rather than straight lines. This is pictured in Figure 13.13(b). In this case,
the regression surface is no longer a plane in three-dimensional space but is instead
a curved surface.

The contours of the regression function for the first-order interaction model
are nonparallel straight lines. This is because the expected change in Y when $x_1$ is
increased by 1 is

$$\beta_0 + \beta_1(x_1 + 1) + \beta_2 x_2 + \beta_3(x_1 + 1)x_2 - (\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2) = \beta_1 + \beta_3 x_2$$

This expected change depends on the value of $x_2$, so each contour line must
have a different slope, as in Figure 13.13(c). The word interaction reflects the fact
that an expected change in Y when one variable increases in value depends on the
value of the other variable.

Finally, for the complete second-order model, the expected change in Y when
$x_2$ is held fixed while $x_1$ is increased by 1 unit is $\beta_1 + \beta_3 + 2\beta_3 x_1 + \beta_5 x_2$, which is
a function of both $x_1$ and $x_2$. This implies that the contours of the regression function
are both curved and not parallel to one another, as illustrated in Figure 13.13(d).

Similar considerations apply to models constructed from more than two independent
variables. In general, the presence of interaction terms in the model implies
that the expected change in Y depends not only on the variable being increased or
decreased but also on the values of some of the fixed variables. As in ANOVA, it
is possible to have higher-way interaction terms (e.g., $x_1 x_2 x_3$), making model interpretation
more difficult.

[Figure: four panels (a)-(d), each plotting E(Y) for several values of $x_2$; surviving panel titles:
(c) E(Y) = -1 + .5x1 - x2 + x1x2; (d) E(Y) = -1 + .5x1 + .25x1^2 - x2 + .5x2^2 + x1x2]

Figure 13.13 Contours of four different regression functions
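
The four expected-change expressions derived above can be verified numerically. In this Python sketch the coefficient values are hypothetical, the list `b` holds $\beta_0, \ldots, \beta_5$ in the order used by the complete second-order model, and for the interaction model `b[3]` plays the role of the interaction coefficient $\beta_3$.

```python
def first_order(x1, x2, b):
    return b[0] + b[1] * x1 + b[2] * x2

def second_order_no_interaction(x1, x2, b):
    return b[0] + b[1] * x1 + b[2] * x2 + b[3] * x1 ** 2 + b[4] * x2 ** 2

def first_order_interaction(x1, x2, b):
    return b[0] + b[1] * x1 + b[2] * x2 + b[3] * x1 * x2

def full_quadratic(x1, x2, b):
    return (b[0] + b[1] * x1 + b[2] * x2
            + b[3] * x1 ** 2 + b[4] * x2 ** 2 + b[5] * x1 * x2)

def unit_change(f, x1, x2, b):
    """Change in the regression function when x1 increases by 1, x2 fixed."""
    return f(x1 + 1, x2, b) - f(x1, x2, b)

b = [-1.0, 0.5, -1.0, 0.25, 0.5, 1.0]   # hypothetical coefficients
x1, x2 = 4.0, 2.0

changes = {
    "first-order": (unit_change(first_order, x1, x2, b), b[1]),
    "no-interaction": (unit_change(second_order_no_interaction, x1, x2, b),
                       b[1] + b[3] + 2 * b[3] * x1),
    "interaction": (unit_change(first_order_interaction, x1, x2, b),
                    b[1] + b[3] * x2),
    "full quadratic": (unit_change(full_quadratic, x1, x2, b),
                       b[1] + b[3] + 2 * b[3] * x1 + b[5] * x2),
}
for name, (numeric, formula) in changes.items():
    print(name, numeric, formula)
```

In each pair the directly computed difference matches the closed-form expression. Changing `x2` alters only the two models that contain the interaction term, which is the algebraic counterpart of nonparallel contours.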


Note that if the model contains interaction or quadratic predictors, the generic
interpretation of a $\beta_i$ given previously will not usually apply. This is because it is not
then possible to increase $x_i$ by 1 unit and hold the values of all other predictors fixed.


Models with Predictors for 
Categorical Variables 


Thus far we have explicitly considered the inclusion of only quantitative (numerical) 
predictor variables in a multiple regression model. Using simple numerical coding, 
qualitative (categorical) variables, such as bearing material (aluminum or copper/ 
lead) or type of wood (pine, oak, or walnut), can also be incorporated into a model. 
Let’s first focus on the case of a dichotomous variable, one with just two possible 
categories—male or female, U.S. or foreign manufacture, and so on. With any such 
variable, we associate a dummy or indicator variable x whose possible values 0 
and 1 indicate which category is relevant for any particular observation. 


EXAMPLE 13.11 The article “Estimating Urban Travel Times: A Comparative Study” (Trans.
Res., 1980: 173-175) described a study relating the dependent variable y = travel
time between locations in a certain city and the independent variable $x_2$ = distance
between locations. Two types of vehicles, passenger cars and trucks, were used in
the study. Let

$$x_1 = \begin{cases} 1 & \text{if the vehicle is a truck} \\ 0 & \text{if the vehicle is a passenger car} \end{cases}$$

One possible multiple regression model is

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$

The mean value of travel time depends on whether a vehicle is a car or a truck:

mean time = $\beta_0 + \beta_2 x_2$ when $x_1 = 0$ (cars)
mean time = $\beta_0 + \beta_1 + \beta_2 x_2$ when $x_1 = 1$ (trucks)

The coefficient $\beta_1$ is the difference in mean times between trucks and cars with
distance held fixed; if $\beta_1 > 0$, on average it will take trucks longer to traverse any
particular distance than it will for cars.

A second possibility is a model with an interaction predictor:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon$$

Now the mean times for the two types of vehicles are

mean time = $\beta_0 + \beta_2 x_2$ when $x_1 = 0$
mean time = $\beta_0 + \beta_1 + (\beta_2 + \beta_3) x_2$ when $x_1 = 1$

For each model, the graph of the mean time versus distance is a straight line for
either type of vehicle, as illustrated in Figure 13.14. The two lines are parallel for
the first (no-interaction) model, but in general they will have different slopes when
the second model is correct. For this latter model, the change in mean travel time
associated with a 1-mile increase in distance depends on which type of vehicle is
involved, so the two predictors “vehicle type” and “distance” interact. Indeed, data
collected by the authors of the cited article suggested the presence of interaction.

[Figure: two panels, (a) and (b), each with vertical axis Mean y]

Figure 13.14 Regression functions for models with one dummy variable ($x_1$) and one
quantitative variable $x_2$: (a) no interaction; (b) interaction ■
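
A minimal Python sketch of the two models in Example 13.11 follows. The coefficient values are hypothetical (the article's estimates are not given here), and `mean_time` is an illustrative name. Setting `b3 = 0` gives the no-interaction model.

```python
def mean_time(distance, is_truck, b0, b1, b2, b3=0.0):
    """Mean travel time b0 + b1*x1 + b2*x2 + b3*x1*x2, where x1 is the
    truck indicator and x2 is distance."""
    x1 = 1 if is_truck else 0
    return b0 + b1 * x1 + b2 * distance + b3 * x1 * distance

def slope(is_truck, **coef):
    """Change in mean time for a 1-unit increase in distance."""
    return mean_time(1.0, is_truck, **coef) - mean_time(0.0, is_truck, **coef)

# Hypothetical coefficients, for illustration only.
no_interaction = dict(b0=2.0, b1=1.0, b2=3.0)
interaction = dict(b0=2.0, b1=1.0, b2=3.0, b3=0.5)

print(slope(False, **no_interaction), slope(True, **no_interaction))
print(slope(False, **interaction), slope(True, **interaction))
```

With no interaction the two slopes coincide (parallel lines separated vertically by `b1`). With the interaction predictor the truck slope becomes `b2 + b3`, matching the second pair of mean-time expressions in the example.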


You might think that the way to handle a three-category situation is to define 
a single numerical variable with coded values such as 0, 1, and 2 corresponding to 
the three categories. This is incorrect, because it imposes an ordering on the catego- 
ries that is not necessarily implied by the problem context. The correct approach to 
incorporating three categories is to define two different dummy variables. Suppose, 
for example, that y is the lifetime of a certain cutting tool, $x_1$ is cutting speed, and
that there are three brands of tool being investigated. Then let

$$x_2 = \begin{cases} 1 & \text{if a brand A tool is used} \\ 0 & \text{otherwise} \end{cases} \qquad x_3 = \begin{cases} 1 & \text{if a brand B tool is used} \\ 0 & \text{otherwise} \end{cases}$$

When an observation on a brand A tool is made, $x_2 = 1$ and $x_3 = 0$, whereas for
a brand B tool, $x_2 = 0$ and $x_3 = 1$. An observation made on a brand C tool has
$x_2 = x_3 = 0$, and it is not possible that $x_2 = x_3 = 1$ because a tool cannot simultaneously
be both brand A and brand B. The no-interaction model would have only the
predictors $x_1$, $x_2$, and $x_3$. The following interaction model allows the mean change
in lifetime associated with a 1-unit increase in speed to depend on the brand of tool:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3 + \epsilon$$

Construction of a picture like Figure 13.14 with a graph for each of the three possible
($x_2$, $x_3$) pairs gives three nonparallel lines (unless $\beta_4 = \beta_5 = 0$).

More generally, incorporating a categorical variable with c possible categories 
into a multiple regression model requires the use of c — 1 indicator variables (e.g., 
five brands of tools would necessitate using four indicator variables). Thus even one 
categorical variable can add many predictors to a model. 
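
The c - 1 indicator-variable coding can be sketched in a few lines of Python. The helper name `indicator_coding` is illustrative, and treating the last listed category as the baseline is an assumption made here to match the brand C convention in the text.

```python
def indicator_coding(value, categories):
    """Code a categorical value with len(categories) - 1 indicator variables.
    The last category is the baseline and is coded as all zeros."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories[:-1]]

brands = ["A", "B", "C"]
codes = {brand: indicator_coding(brand, brands) for brand in brands}
for brand, code in codes.items():
    print(brand, code)
```

Brand A is coded as (1, 0), brand B as (0, 1), and brand C as (0, 0), so no ordering among the brands is implied. With five brands the same function would return four indicators.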


Estimating Parameters 


The data in simple linear regression consists of n pairs $(x_1, y_1), \ldots, (x_n, y_n)$. Suppose
that a multiple regression model contains two predictor variables, $x_1$ and $x_2$.
Then the data set will consist of n triples $(x_{11}, x_{21}, y_1), (x_{12}, x_{22}, y_2), \ldots, (x_{1n}, x_{2n}, y_n)$.
Here the first subscript on x refers to the predictor and the second to the observation
number. More generally, with k predictors, the data consists of n (k + 1)-tuples
$(x_{11}, x_{21}, \ldots, x_{k1}, y_1), (x_{12}, x_{22}, \ldots, x_{k2}, y_2), \ldots, (x_{1n}, x_{2n}, \ldots, x_{kn}, y_n)$, where $x_{ij}$ is the
value of the ith predictor $x_i$ associated with the observed value $y_j$. The observations
are assumed to have been obtained independently of one another according to the
model (13.15). To estimate the parameters $\beta_0, \beta_1, \ldots, \beta_k$ using the principle of least
squares, form the sum of squared deviations of the observed $y_j$'s from a trial function
$y = b_0 + b_1 x_1 + \cdots + b_k x_k$:

$$f(b_0, b_1, \ldots, b_k) = \sum_j \left[ y_j - (b_0 + b_1 x_{1j} + b_2 x_{2j} + \cdots + b_k x_{kj}) \right]^2 \tag{13.17}$$

The least squares estimates are those values of the $b_i$'s that minimize $f(b_0, \ldots, b_k)$.
Taking the partial derivative of f with respect to each $b_i$ (i = 0, 1, ..., k) and equating
all partials to zero yields the following system of normal equations:

$$\begin{aligned}
b_0 n + b_1 \sum x_{1j} + b_2 \sum x_{2j} + \cdots + b_k \sum x_{kj} &= \sum y_j \\
b_0 \sum x_{1j} + b_1 \sum x_{1j}^2 + b_2 \sum x_{1j} x_{2j} + \cdots + b_k \sum x_{1j} x_{kj} &= \sum x_{1j} y_j \\
&\;\;\vdots \\
b_0 \sum x_{kj} + b_1 \sum x_{1j} x_{kj} + \cdots + b_{k-1} \sum x_{k-1,j} x_{kj} + b_k \sum x_{kj}^2 &= \sum x_{kj} y_j
\end{aligned} \tag{13.18}$$

These equations are linear in the unknowns $b_0, b_1, \ldots, b_k$. Solving (13.18) yields the
least squares estimates $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$. This is best done by utilizing a statistical software
package.
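
For readers curious about what such a package does internally, here is a minimal pure-Python sketch that builds the normal equations (13.18) and solves them by Gaussian elimination. It is an illustration under simplifying assumptions, not how production software works (packages typically use more numerically stable matrix factorizations). The data are hypothetical and generated exactly from y = 1 + 2x1 + 3x2 so the answer is known in advance.

```python
def normal_equations(rows, ys):
    """Build the (k+1) x (k+1) normal equations; rows[j] holds the k
    predictor values of observation j."""
    design = [[1.0] + list(r) for r in rows]          # leading 1 goes with b0
    p = len(design[0])
    A = [[sum(d[a] * d[c] for d in design) for c in range(p)] for a in range(p)]
    rhs = [sum(d[a] * y for d, y in zip(design, ys)) for a in range(p)]
    return A, rhs

def solve(A, rhs):
    """Solve A b = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    b = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        tail = sum(M[i][c] * b[c] for c in range(i + 1, n))
        b[i] = (M[i][n] - tail) / M[i][i]
    return b

# Hypothetical data lying exactly on y = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

A, rhs = normal_equations(rows, ys)
estimates = solve(A, rhs)
print([round(b, 6) for b in estimates])
```

Because the data contain no random error, the solution should reproduce the generating coefficients 1, 2, and 3 up to floating-point rounding. With real data the same two steps give the least squares estimates.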


EXAMPLE 13.12 The article “How to Optimize and Control the Wire Bonding Process: Part II”
(Solid State Technology, Jan. 1991: 67-72) described an experiment carried out to assess
the impact of the variables $x_1$ = force (gm), $x_2$ = power (mW), $x_3$ = temperature (°C),
and $x_4$ = time (msec) on y = ball bond shear strength (gm). The following data* was
generated to be consistent with the information given in the article:

Observation   Force   Power   Temperature   Time   Strength
 1              30      60        175         15     26.2
 2              40      60        175         15     26.3
 3              30      90        175         15     39.8
 4              40      90        175         15     39.7
 5              30      60        225         15     38.6
 6              40      60        225         15     35.5
 7              30      90        225         15     48.8
 8              40      90        225         15     37.8
 9              30      60        175         25     26.6
10              40      60        175         25     23.4
11              30      90        175         25     38.6
12              40      90        175         25     52.1
13              30      60        225         25     39.5
14              40      60        225         25     32.3
15              30      90        225         25     43.0
16              40      90        225         25     56.0
17              25      75        200         20     35.2
18              45      75        200         20     46.9
19              35      45        200         20     22.7
20              35     105        200         20     58.7
21              35      75        150         20     34.5
22              35      75        250         20     44.0
23              35      75        200         10     35.7
24              35      75        200         30     41.8
25              35      75        200         20     36.5
26              35      75        200         20     37.6
27              35      75        200         20     40.3
28              35      75        200         20     46.0
29              35      75        200         20     27.8
30              35      75        200         20     40.3

A statistical computer package gave the following least squares estimates:

$\hat\beta_0 = -37.48$   $\hat\beta_1 = .2117$   $\hat\beta_2 = .4983$   $\hat\beta_3 = .1297$   $\hat\beta_4 = .2583$

Thus we estimate that .1297 gm is the average change in strength associated with a
1-degree increase in temperature when the other three predictors are held fixed; the
other estimated coefficients are interpreted in a similar manner.

The estimated regression equation is

$$\hat{y} = -37.48 + .2117x_1 + .4983x_2 + .1297x_3 + .2583x_4$$

A point prediction of strength resulting from a force of 35 gm, power of 75 mW,
temperature of 200°C, and time of 20 msec is

$$\hat{y} = -37.48 + (.2117)(35) + (.4983)(75) + (.1297)(200) + (.2583)(20) = 38.41 \text{ gm}$$

This is also a point estimate of the mean value of strength for the specified values of
force, power, temperature, and time. ■

* From the book Statistics for Engineering Problem Solving by Stephen Vardeman, an excellent exposition
of the territory covered by our book, albeit at a somewhat higher level.


R2 and o2 


Predicted or fitted values, residuals, and the various sums of squares are calculated 
as in simple linear and polynomial regression. The predicted value }, results from 
substituting the values of the various predictors from the first observation into the 
estimated regression function: 


$\hat{y}_1 = \hat{\beta}_0 + \hat{\beta}_1 x_{11} + \hat{\beta}_2 x_{21} + \cdots + \hat{\beta}_k x_{k1}$ 


The remaining predicted values $\hat{y}_2, \ldots, \hat{y}_n$ come from substituting values of the 
predictors from the 2nd, 3rd, ..., and finally nth observations into the estimated 
function. For example, the values of the 4 predictors for the last observation in 
Example 13.12 are $x_{1,30} = 35$, $x_{2,30} = 75$, $x_{3,30} = 200$, and $x_{4,30} = 20$, so 


$\hat{y}_{30} = -37.48 + .2117(35) + .4983(75) + .1297(200) + .2583(20) = 38.41$ 


The residuals $y_1 - \hat{y}_1, \ldots, y_n - \hat{y}_n$ are the differences between the observed and 
predicted values. The last residual in Example 13.12 is 40.3 − 38.41 = 1.89. The closer 
the residuals are to 0, the better the job our estimated regression function is doing in 
making predictions corresponding to observations in the sample. 
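The fitted value and residual for the last observation can be reproduced in a few lines. This is a minimal Python sketch using the rounded coefficient estimates quoted above; the function name `predict` is a hypothetical helper introduced only for illustration.

```python
# Rounded coefficient estimates quoted for Example 13.12: b0, b1, b2, b3, b4
coef = [-37.48, 0.2117, 0.4983, 0.1297, 0.2583]


def predict(coef, x):
    """Evaluate b0 + b1*x1 + ... + bk*xk for one observation x = (x1, ..., xk)."""
    return coef[0] + sum(b * xi for b, xi in zip(coef[1:], x))


# Last (30th) observation: force 35, power 75, temperature 200, time 20, strength 40.3
x_30, y_30 = (35, 75, 200, 20), 40.3
y_hat_30 = predict(coef, x_30)
residual_30 = y_30 - y_hat_30
print(round(y_hat_30, 2), round(residual_30, 2))  # 38.41 1.89
```

Applying `predict` to every row of the data and subtracting from the observed strengths would give the full set of n residuals.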

Error or residual sum of squares is $\text{SSE} = \sum (y_i - \hat{y}_i)^2$. It is again interpreted 
as a measure of how much variation in the observed y values is not explained by 
(not attributed to) the model relationship. The number of df associated with SSE 
is $n - (k + 1)$ because $k + 1$ df are lost in estimating the $k + 1$ $\beta$ coefficients. 
Total sum of squares, a measure of total variation in the observed y values, is 
$\text{SST} = \sum (y_i - \bar{y})^2$. Regression sum of squares $\text{SSR} = \sum (\hat{y}_i - \bar{y})^2 = \text{SST} - \text{SSE}$ is 
a measure of explained variation. Then the coefficient of multiple determination 
$R^2$ is 


$R^2 = 1 - \text{SSE}/\text{SST} = \text{SSR}/\text{SST}$ 


It is interpreted as the proportion of observed y variation that can be explained by 
the multiple regression model fit to the data. 

Because there is no preliminary picture of multiple regression data analogous 
to a scatterplot for bivariate data, the coefficient of multiple determination is our 
first indication of whether the chosen model is successful in explaining y variation. 
Unfortunately, there is a problem with $R^2$: Its value can be inflated by adding lots of 
predictors into the model even if most of these predictors are rather frivolous. For 
example, suppose y is the sale price of a house. Then sensible predictors include 
$x_1$ = the interior size of the house, $x_2$ = the size of the lot on which the house sits, 
$x_3$ = the number of bedrooms, $x_4$ = the number of bathrooms, and $x_5$ = the house's 
age. Now suppose we add in $x_6$ = the diameter of the doorknob on the coat closet, 
$x_7$ = the thickness of the cutting board in the kitchen, $x_8$ = the thickness of the patio 
slab, and so on. Unless we are very unlucky in our choice of predictors, using n − 1 
predictors (one fewer than the sample size) will yield $R^2 = 1$. So the objective in 
multiple regression is not simply to explain most of the observed y variation, but to 
do so using a model with relatively few predictors that are easily interpreted. It is 
thus desirable to adjust $R^2$, as was done in polynomial regression, to take account of 
the size of the model: 


$R_a^2 = 1 - \frac{\text{SSE}/[n - (k + 1)]}{\text{SST}/(n - 1)} = 1 - \frac{n - 1}{n - (k + 1)} \cdot \frac{\text{SSE}}{\text{SST}}$ 


Because the ratio in front of SSE/SST exceeds 1, $R_a^2$ is smaller than $R^2$. Furthermore, 
the larger the number of predictors k relative to the sample size n, the smaller $R_a^2$ will 
be relative to $R^2$. Adjusted $R^2$ can even be negative, whereas $R^2$ itself must be between 
0 and 1. A value of $R_a^2$ that is substantially smaller than $R^2$ itself is a warning that the 
model may contain too many predictors. 

The positive square root of $R^2$ is called the multiple correlation coefficient and 
is denoted by R. It can be shown that R is the sample correlation coefficient calculated 
from the $(\hat{y}_i, y_i)$ pairs (that is, use $\hat{y}_i$ in place of $x_i$ in the formula for r from Section 12.5). 

SSE is also the basis for estimating the remaining model parameter: 


$\hat{\sigma}^2 = s^2 = \frac{\text{SSE}}{n - (k + 1)} = \text{MSE}$ 


EXAMPLE 13.13 Investigators carried out a study to see how various characteristics of concrete are 
influenced by $x_1$ = % limestone powder and $x_2$ = water-cement ratio, resulting in the 
accompanying data (“Durability of Concrete with Addition of Limestone Powder,” 
Magazine of Concrete Research, 1996: 131-137). 


$x_1$ $x_2$ $x_1x_2$ 28-day Comp Str. (MPa) Adsorbability (%) 
21 .65 13.65 33.55 8.42 
21 .55 11.55 47.55 6.26 
7 .65 4.55 35.00 6.74 
7 .55 3.85 35.90 6.59 
28 .60 16.80 40.90 7.28 
0 .60 0.00 39.10 6.90 
14 .70 9.80 31.55 10.80 
14 .50 7.00 48.00 5.63 
14 .60 8.40 42.30 7.43 


Compressive strength: $\bar{y} = 39.317$, SST = 278.52. Adsorbability: $\bar{y} = 7.339$, SST = 18.356 


Consider first compressive strength as the dependent variable y. Fitting the first-order 
model results in 


$\hat{y} = 84.82 + .1643x_1 - 79.67x_2$, SSE = 72.25 (df = 6), $R^2 = .741$, $R_a^2 = .654$ 

whereas including an interaction predictor gives 

$\hat{y} = 6.22 + 5.779x_1 + 51.33x_2 - 9.357x_1x_2$, SSE = 29.35 (df = 5), $R^2 = .895$, $R_a^2 = .831$ 


Based on this latter fit, a prediction for compressive strength when % limestone = 14 
and water—cement ratio = .60 is 


$\hat{y} = 6.22 + 5.779(14) + 51.33(.60) - 9.357(8.4) = 39.32$ 


Fitting the full quadratic relationship results in virtually no change in the $R^2$ value. 
However, when the dependent variable is adsorbability, the following results 
are obtained: $R^2 = .747$ when just two predictors are used, .802 when the interaction 
predictor is added, and .889 when the five predictors for the full quadratic 
relationship are used. ■ 
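The summary quantities for a fitted model follow directly from SSE, SST, n, and k. The short Python sketch below applies the formulas for $R^2$, adjusted $R^2$, and MSE to the interaction model for compressive strength (SSE = 29.35, SST = 278.52, n = 9, k = 3); the function names are hypothetical helpers, not part of any package.

```python
def r_squared(sse, sst):
    """Coefficient of multiple determination: 1 - SSE/SST."""
    return 1 - sse / sst


def adjusted_r_squared(sse, sst, n, k):
    """Adjusted R^2: 1 - [(n - 1) / (n - (k + 1))] * SSE/SST."""
    return 1 - (n - 1) / (n - (k + 1)) * sse / sst


# Interaction model for compressive strength: predictors x1, x2, and x1*x2
sse, sst, n, k = 29.35, 278.52, 9, 3

r2 = r_squared(sse, sst)
r2_adj = adjusted_r_squared(sse, sst, n, k)
mse = sse / (n - (k + 1))  # estimate of sigma^2

print(round(r2, 3), round(r2_adj, 3), round(mse, 2))  # 0.895 0.831 5.87
```

With only n = 9 observations the multiplier (n − 1)/(n − (k + 1)) = 8/5 is large, which is why adjusted $R^2$ falls noticeably below $R^2$ here.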


In general, $\hat{\beta}_i$ can be interpreted as an estimate of the average change in Y 
associated with a 1-unit increase in $x_i$ while values of all other predictors are held 
fixed. Sometimes, though, it is difficult or even impossible to increase the value 
of one predictor while holding all others fixed. In such situations, there is an alternative 
interpretation of the estimated regression coefficients. For concreteness, 
suppose that k = 2, and let $\hat{\beta}_1$ denote the estimate of $\beta_1$ in the regression of y on the 
two predictors $x_1$ and $x_2$. Then 


1. Regress y against just $x_2$ (a simple linear regression) and denote the resulting 
residuals by $g_1, g_2, \ldots, g_n$. These residuals represent variation in y after removing 
or adjusting for the effects of $x_2$. 


2. Regress $x_1$ against $x_2$ (that is, regard $x_1$ as the dependent variable and $x_2$ as the 
independent variable in this simple linear regression), and denote the residuals by 
$f_1, \ldots, f_n$. These residuals represent variation in $x_1$ after removing or adjusting for 
the effects of $x_2$. 


Now consider plotting the residuals from the first regression against those from 
the second; that is, plot the pairs $(f_1, g_1), \ldots, (f_n, g_n)$. The result is called a partial 
residual plot or adjusted residual plot. If a regression line is fit to the points in this 
plot, the slope turns out to be exactly $\hat{\beta}_1$ (furthermore, the residuals from this line are 
exactly the residuals $e_1, \ldots, e_n$ from the multiple regression of y on $x_1$ and $x_2$). Thus 
$\hat{\beta}_1$ can be interpreted as the estimated change in y associated with a 1-unit increase 
in $x_1$ after removing or adjusting for the effects of any other model predictors. The 
same interpretation holds for other estimated coefficients regardless of the number 
of predictors in the model (there is nothing special about k = 2; the foregoing argument 
remains valid if y is regressed against all predictors other than $x_1$ in Step 1 and 
$x_1$ is regressed against the other k − 1 predictors in Step 2). 
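The claim that the slope in the adjusted residual plot equals the multiple regression coefficient can be checked numerically. The Python sketch below does so for k = 2 with a small data set that is invented purely for illustration (it does not come from the text); `simple_fit`, `residuals`, and `two_predictor_fit` are hypothetical helper names.

```python
def simple_fit(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def residuals(x, y):
    """Residuals from the simple linear regression of y on x."""
    b0, b1 = simple_fit(x, y)
    return [b - (b0 + b1 * a) for a, b in zip(x, y)]


def two_predictor_fit(x1, x2, y):
    """Return (b0, b1, b2) for the regression of y on x1 and x2 (centered normal equations)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


# Hypothetical data, invented only to illustrate the identity
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 7.0, 5.0, 8.0, 6.0]
y = [3.1, 3.9, 7.2, 7.8, 12.5, 11.1, 15.2, 14.0]

b0, b1, b2 = two_predictor_fit(x1, x2, y)
g = residuals(x2, y)    # Step 1: y adjusted for x2
f = residuals(x2, x1)   # Step 2: x1 adjusted for x2
_, slope = simple_fit(f, g)

print(abs(slope - b1) < 1e-9)  # True
```

The agreement is exact up to floating-point rounding, whatever data are used, provided $x_1$ and $x_2$ are not perfectly collinear.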

As an example, suppose that y is the sale price of an apartment building and 
that the predictors are number of apartments, age, lot size, number of parking spaces, 
and gross building area (ft²). It may not be reasonable to increase the number of 
apartments without also increasing gross area. However, if $\hat{\beta}_5 = 16.00$, then we 
estimate that a $16 increase in sale price is associated with each extra square foot of 
gross area after adjusting for the effects of the other four predictors. 


A Model Utility Test 


The absence of an informative picture of multivariate data and the aforementioned 
difficulty with $R^2$ provide compelling reasons for seeking a formal test of model 
utility. The model utility test in simple linear regression involved the null hypothesis 
$H_0\colon \beta_1 = 0$, according to which there is no useful relation between y and the single 
predictor x. Here we consider the assertion that $\beta_1 = 0, \beta_2 = 0, \ldots, \beta_k = 0$, which 
says that there is no useful relationship between y and any of the k predictors. If at 
least one of these $\beta$'s is not 0, the corresponding predictor(s) is (are) useful. The test 
is based on a statistic that has a particular F distribution when $H_0$ is true. 


Null hypothesis: $H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k = 0$ 
Alternative hypothesis: $H_a$: at least one $\beta_i \neq 0$ $(i = 1, \ldots, k)$ 

Test statistic value: $f = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]} = \frac{\text{SSR}/k}{\text{SSE}/[n - (k + 1)]} = \frac{\text{MSR}}{\text{MSE}}$ (13.19) 

where SSR = regression sum of squares = SST − SSE 

When $H_0$ is true, the test statistic F has an F distribution with k numerator df 
and n − (k + 1) denominator df. The test is upper-tailed, so the P-value is the 
area under the $F_{k,\,n-(k+1)}$ curve to the right of f. 


Except for a constant multiple, the test statistic here is $R^2/(1 - R^2)$, the ratio of 
explained to unexplained variation. If the proportion of explained variation is high 
relative to unexplained, we would naturally want to reject $H_0$ and confirm the utility 
of the model; this explains why the test is upper-tailed (only large values of f argue 
against $H_0$). However, if k is large relative to n, the factor $[n - (k + 1)]/k$ will 
decrease f considerably. 


EXAMPLE 13.14 Returning to the bond shear strength data of Example 13.12, a model with k = 4 
predictors was fit, so the relevant hypotheses are 


$H_0\colon \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ 


$H_a$: at least one of these four $\beta$'s is not 0 


Figure 13.15 shows output from the JMP statistical package. The values of s (Root 
Mean Square Error), $R^2$, and adjusted $R^2$ certainly suggest a useful model. The value 
of the model utility F ratio is 

$f = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]} = \frac{.713959/4}{.286041/(30 - 5)} = 15.60$ 


Response: strength 


RSquare 0.713959 
RSquare Adj 0.668193 
Root Mean Square Error 5.157979 
Mean of Response 38.40667 
Observations (or Sum Wgts) 30 


Parameter Estimates 


Term Estimate Std Error t Ratio Prob>|t| 
Intercept -37.47667 13.09964 -2.86 0.0084 
force 0.2116667 0.210574 1.01 0.3244 
power 0.4983333 0.070191 7.10 <.0001 
temp 0.1296667 0.042115 3.08 0.0050 
time 0.2583333 0.210574 1.23 0.2313 


Whole-Model Test 
Analysis of Variance 


Source DF Sum of Squares Mean Square F Ratio 
Model 4 1660.1400 415.035 15.6000 
Error 25 665.1187 26.605 Prob > F 
C Total 29 2325.2587 <.0001 


Figure 13.15 Multiple regression output from JMP for the data of Example 13.14 

This value also appears in the F Ratio column of the ANOVA table in Figure 13.15. 
The largest F critical value for 4 numerator and 25 denominator df in Appendix 
Table A.9 is 6.49, which captures an upper-tail area of .001. Thus P-value < .001. 
The ANOVA table in the JMP output shows that P-value < .0001. This is a highly 
significant result. The null hypothesis should be rejected at any reasonable significance 
level. We conclude that there is a useful linear relationship between y and at 
least one of the four predictors in the model. This does not mean that all four predictors 
are useful; we will say more about this subsequently. ■ 
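The two equivalent forms of the model utility statistic can be compared numerically. This Python sketch (function names are hypothetical) evaluates f once from $R^2$ and once from the ANOVA sums of squares reported in the JMP output of Figure 13.15.

```python
def f_from_r2(r2, n, k):
    """Model utility f computed from R^2: (R^2/k) / ((1 - R^2)/(n - (k + 1)))."""
    return (r2 / k) / ((1 - r2) / (n - (k + 1)))


def f_from_anova(ssr, sse, n, k):
    """Model utility f computed as MSR/MSE."""
    return (ssr / k) / (sse / (n - (k + 1)))


n, k = 30, 4                                   # bond shear strength data
f1 = f_from_r2(0.713959, n, k)                 # RSquare from the output
f2 = f_from_anova(1660.1400, 665.1187, n, k)   # Model and Error sums of squares

print(round(f1, 2), round(f2, 2))  # 15.6 15.6
```

The two forms agree because dividing both SSR and SSE by SST turns MSR/MSE into the $R^2$ version of the statistic. Obtaining the P-value itself requires an F distribution function, which the standard library does not provide; a table such as Appendix Table A.9 or a statistics package is needed for that step.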


Inferences in Multiple Regression 


Before testing hypotheses, constructing CI’s, and making predictions, the adequacy 
of the model should be assessed and the impact of any unusual observations inves- 
tigated. Methods for doing this are described at the end of the present section and 
in the next section. 

Because each $\hat{\beta}_i$ is a linear function of the $y_i$'s, the standard deviation (standard 
error) of each $\hat{\beta}_i$ is the product of $\sigma$ and a function of the $x_{ij}$'s. An estimate $s_{\hat{\beta}_i}$ of 
this SD is obtained by substituting s for $\sigma$. The function of the $x_{ij}$'s is quite complicated, 
but all widely used statistical software packages compute and show the $s_{\hat{\beta}_i}$'s. 
Inferences concerning a single $\beta_i$ are based on the standardized variable 


$T = \frac{\hat{\beta}_i - \beta_i}{S_{\hat{\beta}_i}}$ 

which has a t distribution with n − (k + 1) df. 

The point estimate of $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$, the expected value of Y when $x_1 = x_1^*, \ldots, x_k = x_k^*$, 
is $\hat{\mu}_{Y \cdot x_1^*, \ldots, x_k^*} = \hat{\beta}_0 + \hat{\beta}_1 x_1^* + \cdots + \hat{\beta}_k x_k^*$. The estimated standard deviation 
of the corresponding estimator is again a complicated expression involving the 
sample $x_{ij}$'s. However, appropriate software will calculate it on request. Inferences 
about $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$ are based on standardizing its estimator to obtain a t variable having 
n − (k + 1) df. 


1. A 100(1 − α)% CI for $\beta_i$, the coefficient of $x_i$ in the regression function, is 

$\hat{\beta}_i \pm t_{\alpha/2,\,n-(k+1)} \cdot s_{\hat{\beta}_i}$ 


2. A test for $H_0\colon \beta_i = \beta_{i0}$ uses the t statistic value $t = (\hat{\beta}_i - \beta_{i0})/s_{\hat{\beta}_i}$ based 
on n − (k + 1) df. The test is upper-, lower-, or two-tailed according to 
whether $H_a$ contains the inequality >, <, or ≠. The most frequently tested 
null hypothesis in practice is $H_0\colon \beta_i = 0$. The interpretation is that as long 
as all the other predictors $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k$ remain in the model, 
the predictor $x_i$ provides no additional useful information about y. The customary 
alternative is $H_a\colon \beta_i \neq 0$, according to which $x_i$ does provide useful 
information over and above what is contained in the other k − 1 predictors. 
The test statistic value is the t ratio $\hat{\beta}_i / s_{\hat{\beta}_i}$, the ratio of the estimated coefficient 
to its estimated standard error. 


3. A 100(1 − α)% CI for $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$ is 

$\hat{\mu}_{Y \cdot x_1^*, \ldots, x_k^*} \pm t_{\alpha/2,\,n-(k+1)} \cdot \{\text{estimated SD of } \hat{\mu}_{Y \cdot x_1^*, \ldots, x_k^*}\} = \hat{y} \pm t_{\alpha/2,\,n-(k+1)} \cdot s_{\hat{Y}}$ 

where $\hat{Y}$ is the statistic $\hat{\beta}_0 + \hat{\beta}_1 x_1^* + \cdots + \hat{\beta}_k x_k^*$ and $\hat{y}$ is the calculated value 
of $\hat{Y}$. 


4. A 100(1 − α)% PI for a future y value is 

$\hat{\mu}_{Y \cdot x_1^*, \ldots, x_k^*} \pm t_{\alpha/2,\,n-(k+1)} \cdot \{s^2 + (\text{estimated SD of } \hat{\mu}_{Y \cdot x_1^*, \ldots, x_k^*})^2\}^{1/2} = \hat{y} \pm t_{\alpha/2,\,n-(k+1)} \cdot \sqrt{s^2 + s_{\hat{Y}}^2}$ 

Simultaneous intervals for which the simultaneous confidence or prediction 
level is controlled can be obtained by applying the Bonferroni technique. 


EXAMPLE 13.15 The article “Independent but Additive Effects of Fluorine and Nitrogen Substitution 
on Properties of a Calcium Aluminosilicate Glass” (J. of the Amer. Ceramic Soc., 
2012: 600-606) used multiple regression analyses to investigate various properties of 
glasses in the Ca-Si-Al-O-N-F system. The following data on microhardness (GPa) 
resulted from various compositions of 28Ca:57Si:15Al:(100 − x − y)O:xN:yF glasses: 


Obs N F microhardness 
1 0 0 6.1 
2 20 0 7.5 
3 0 1 6.2 
4 20 1 7.6 
5 40 1 8.6 
6 0 5 6.1 
7 5 5 6.4 
8 10 5 6.7 
9 15 5 6.9 
10 20 5 7.2 
11 0 0 6.1 
12 0 1 6.2 
13 0 3 6.1 
14 0 5 6.1 
15 20 0 7.5 
16 20 1 7.6 
17 20 3 7.2 
18 20 5 7.2 


The model fit by the investigators was $Y = \beta_0 + \beta_1 N + \beta_2 F + \epsilon$. Figure 13.16 shows 
output from Minitab: 

MicroHard = 6.23 + 0.0618 N - 0.0387 F 

Predictor Coef SE Coef T P 
Constant 6.22769 0.04615 134.93 0.000 
N 0.061823 0.002099 29.46 0.000 
F -0.03872 0.01122 -3.45 0.004 

S = 0.100051 R-Sq = 98.4% R-Sq(adj) = 98.2% 

Source DF SS MS F P 
Regression 2 9.1348 4.5674 456.28 0.000 
Residual Error 15 0.1502 0.0100 
Total 17 9.2850 


Figure 13.16 Minitab output for Example 13.15 


In addition, when N = 20 and F = 1, 

$\hat{\mu}_{Y \cdot 20,1} = \hat{y} = 6.22769 + (.061823)(20) - (.03872)(1) = 7.4254$ 

and the estimated standard deviation of $\hat{Y}$ for these values of the predictors is $s_{\hat{Y}} = .0332$. 
The very high $R^2$ indicates that almost all of the observed variation in microhardness 
can be attributed to the model relationship and the fact that both nitrogen % 
and fluorine % are varying. And the F ratio of 456.28 with a corresponding P-value 
of .000 in the ANOVA table resoundingly confirms the utility of the fitted model. 

Inferences about the individual regression coefficients are based on the t distribution 
with 18 − (2 + 1) = 15 df (degrees of freedom for error in the ANOVA 
table), and $t_{.025,15} = 2.131$. A 95% CI for $\beta_1$ is 


$\hat{\beta}_1 \pm t_{.025,15} \cdot s_{\hat{\beta}_1} = .061823 \pm (2.131)(.002099) = .061823 \pm .004473 \approx (.0573, .0663)$ 


We estimate that the expected change in microhardness associated with an increase 
of 1% in N while holding F fixed is between .0573 GPa and .0663 GPa. A similar 
calculation gives (−.0626, −.0148) as a 95% CI for $\beta_2$. The Bonferroni technique 
implies that the simultaneous confidence level for both intervals is at least 90%. 

The t ratio for testing $H_0\colon \beta_1 = 0$ vs. $H_a\colon \beta_1 \neq 0$ is $\hat{\beta}_1/s_{\hat{\beta}_1} = .061823/.002099 = 29.46$. 
The corresponding P-value is twice the area under the $t_{15}$ curve to the right of 
29.46, which according to Minitab output is .000. Thus even with F remaining in the 
model, the predictor N provides additional useful information about microhardness. 
The evidence against $H_0\colon \beta_2 = 0$ in favor of $H_a\colon \beta_2 \neq 0$ is not quite so compelling; 
Figure 13.16 shows the P-value to be .004. So at significance level .05 or .01, $H_0$ would 
be rejected; it appears that F also provides useful information over and above what 
is contained in N. There is no reason to delete either predictor from the model. Since 
neither 95% CI contains 0, it is no surprise that both null hypotheses are rejected at 
significance level .05. 

A 95% CI for true average hardness when N = 20 and F = 1 is 


$7.4254 \pm (2.131)(.0332) = 7.4254 \pm .0707 \approx (7.35, 7.50)$ 


A 95% prediction interval for the hardness resulting from a single observation when 
N = 20 and F = 1 is 


$7.4254 \pm (2.131)\sqrt{(.100051)^2 + (.0332)^2} = 7.4254 \pm .2246 \approx (7.20, 7.65)$ 


The PI is about three times as wide as the CI, reflecting the extra uncertainty in 
prediction. ■ 
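The four intervals in this example all have the form estimate ± (t critical value)(estimated SD). The Python sketch below reproduces them from the summary values in the Minitab output, taking the critical value 2.131 and $s_{\hat{Y}} = .0332$ from the text rather than computing them; `interval` is a hypothetical helper name.

```python
from math import sqrt

T_CRIT = 2.131  # t critical value for 15 df and upper-tail area .025, as quoted in the text


def interval(center, half_width):
    """Return the interval (center - half_width, center + half_width)."""
    return (center - half_width, center + half_width)


# Summary values from the Minitab output for the microhardness model
b0, b1, b2 = 6.22769, 0.061823, -0.03872
se_b1, se_b2 = 0.002099, 0.01122
s, s_yhat = 0.100051, 0.0332  # s_yhat applies to N = 20, F = 1

ci_b1 = interval(b1, T_CRIT * se_b1)            # CI for beta_1
ci_b2 = interval(b2, T_CRIT * se_b2)            # CI for beta_2
y_hat = b0 + b1 * 20 + b2 * 1                   # point estimate at N = 20, F = 1
ci_mean = interval(y_hat, T_CRIT * s_yhat)      # CI for mean hardness
pi_new = interval(y_hat, T_CRIT * sqrt(s ** 2 + s_yhat ** 2))  # PI for one new y

for name, (lo, hi) in [("beta1", ci_b1), ("beta2", ci_b2),
                       ("mean", ci_mean), ("new y", pi_new)]:
    print(f"{name}: ({lo:.4f}, {hi:.4f})")
```

The prediction interval differs from the confidence interval for the mean only through the extra $s^2$ under the square root, which here dominates $s_{\hat{Y}}^2$ and accounts for the roughly threefold difference in width.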


An F Test for a Group of Predictors The model utility F test was appropriate 
for testing whether there is useful information about the dependent variable in any 
of the k predictors (i.e., whether $\beta_1 = \cdots = \beta_k = 0$). In many situations, one first 
builds a model containing k predictors and then wishes to know whether any of the 
predictors in a particular subset provide useful information about Y. For example, 
a model to be used to predict students’ test scores might include a group of back- 
ground variables such as family income and education levels and also some school 
characteristic variables such as class size and spending per pupil. One interesting 
hypothesis is that the school characteristic predictors can be dropped from the model. 

Let's label the predictors as $x_1, x_2, \ldots, x_l, x_{l+1}, \ldots, x_k$, so that it is the last k − l 
that we are considering deleting. The relevant hypotheses are as follows: 


$H_0\colon \beta_{l+1} = \beta_{l+2} = \cdots = \beta_k = 0$ 
(so the “reduced” model $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_l x_l + \epsilon$ is correct) 

versus 

$H_a$: at least one among $\beta_{l+1}, \ldots, \beta_k$ is not 0 
(so in the “full” model $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \epsilon$, at least 
one of the last k − l predictors provides useful information) 

The test is carried out by fitting both the full and reduced models. Because the full model 
contains not only the predictors of the reduced model but also some extra predictors, it 
should fit the data at least as well as the reduced model. That is, if we let $\text{SSE}_k$ be the 
sum of squared residuals for the full model and $\text{SSE}_l$ be the corresponding sum for the 
reduced model, then $\text{SSE}_k \le \text{SSE}_l$. Intuitively, if $\text{SSE}_k$ is a great deal smaller than $\text{SSE}_l$, 
the full model provides a much better fit than the reduced model; the appropriate test 
statistic should then depend on the reduction $\text{SSE}_l - \text{SSE}_k$ in unexplained variation. 


$\text{SSE}_k$ = unexplained variation for the full model 
$\text{SSE}_l$ = unexplained variation for the reduced model 

Test statistic value: $f = \frac{(\text{SSE}_l - \text{SSE}_k)/(k - l)}{\text{SSE}_k/[n - (k + 1)]}$ (13.20) 

The test is upper-tailed; the P-value is the area under the $F_{k-l,\,n-(k+1)}$ curve 
to the right of f. 


EXAMPLE 13.16 Soluble dietary fiber (SDF) can provide health benefits by lowering blood choles- 
terol and glucose levels. The article “Effects of Twin-Screw Extrusion on Soluble 
Dietary Fiber and Physicochemical Properties of Soybean Residue” (Food 
Chemistry, 2013: 884-889) reported the following data on y = SDF content (%) 
in soybean residue and the three predictors extrusion temperature ($x_1$, in °C), feed 
moisture ($x_2$, in %), and screw speed ($x_3$, in rpm) of a twin-screw extrusion process. 


obs $x_1$ $x_2$ $x_3$ y 
1 35 110 160 11.13 
2 25 130 180 10.98 
3 30 110 180 12.56 
4 30 130 200 11.46 
5 30 110 180 12.38 
6 30 110 180 12.43 
7 30 110 180 12.55 
8 25 110 160 10.59 
9 30 130 160 11.15 
10 30 90 200 10.55 
11 30 90 160 9.25 
12 25 90 180 9.58 
13 35 110 200 11.59 
14 35 90 180 10.68 
15 35 130 180 11.73 
16 25 110 200 10.81 
17 30 110 180 12.68 


The authors of the cited article fit the complete second-order model with 
predictors $x_1, x_2, x_3, x_1^2, x_2^2, x_3^2, x_1x_2, x_1x_3$, and $x_2x_3$. Figure 13.17 shows Minitab output 
resulting from fitting this model. Note that $R^2 = .987$, so almost all of the 
observed variation in y can be explained by the model relationship, and adjusted $R^2$ 
is only slightly smaller than $R^2$ itself. Furthermore, the F ratio for model utility is 
59.93 with a corresponding P-value of .000 (the area under the $F_{9,7}$ curve to the right 
of 59.93). So the null hypothesis that all nine $\beta_i$'s corresponding to predictors have 
value 0 is resoundingly rejected. There appears to be a useful relationship between 
the dependent variable and at least one of the predictors. 

The regression equation is 
y = -132 + 1.69 x1 + 0.777 x2 + 0.798 x3 - 0.0270 x1sqd - 0.00276 x2sqd 
- 0.00204 x3sqd - 0.000875 x1x2 + 0.000600 x1x3 - 0.000619 x2x3 

Predictor Coef SE Coef T P 
Constant -131.61 10.41 -12.64 0.000 
x1 1.6875 0.2764 6.10 0.000 
x2 0.77688 0.06683 11.62 0.000 
x3 0.79788 0.08484 9.40 0.000 
x1sqd -0.027000 0.003418 -7.90 0.000 
x2sqd -0.0027563 0.0002136 -12.90 0.000 
x3sqd -0.0020375 0.0002136 -9.54 0.000 
x1x2 -0.0008750 0.0008767 -1.00 0.352 
x1x3 0.0006000 0.0008767 0.68 0.516 
x2x3 -0.0006188 0.0002192 -2.82 0.026 

S = 0.175347 R-Sq = 98.7% R-Sq(adj) = 97.1% 

Analysis of Variance 

Source DF SS MS F P 
Regression 9 16.5830 1.8426 59.93 0.000 
Residual Error 7 0.2152 0.0307 
Total 16 16.7982 

Source DF Seq SS 
x1 1 1.2561 
x2 1 3.4585 
x3 1 0.6555 
x1sqd 1 2.5869 
x2sqd 1 5.5393 
x3sqd 1 2.7967 
x1x2 1 0.0306 
x1x3 1 0.0144 
x2x3 1 0.2450 


Figure 13.17 Minitab output from fitting the complete second-order model to the data of 
Example 13.16 


Is the inclusion of the second-order predictors justified? That is, should the 
reduced model consisting of just the predictors $x_1$, $x_2$, and $x_3$ (l = 3) be used? 
The hypotheses to be tested are 


$H_0\colon \beta_4 = \beta_5 = \cdots = \beta_9 = 0$ 

versus 

$H_a$: at least one among $\beta_4, \ldots, \beta_9$ is not 0 


SSE = .2152 for the full model (from the ANOVA table of Figure 13.17). Now we 
need SSE for the reduced model that contains only the three first-order predictors 
$x_1$, $x_2$, and $x_3$. It is actually not necessary to fit this model because of the “Sequential 
Sums of Squares” information at the bottom of Figure 13.17. Each number in the 
last column gives the increase in SSR (explained variation) when another predictor 
is entered into the model. So SSR for the reduced model is 1.2561 + 3.4585 + 
.6555 = 5.3701. Subtracting this from SST = 16.7982 (which is the same for both 
models) gives SSE = 11.4281. Because k − l = 9 − 3 = 6 predictors are being tested, 
the value of the F statistic is 


$f = \frac{(11.4281 - .2152)/6}{.2152/[17 - (9 + 1)]} = \frac{1.8688}{.03074} = 60.8$ 


The P-value is the area under the $F_{6,7}$ curve to the right of this value, which unsurprisingly 
is essentially 0. So the null hypothesis is resoundingly rejected. There is very convincing 
evidence for concluding that at least one of the second-order predictors is providing 
useful information over and above what is provided by the three first-order predictors. 
This conclusion makes intuitive sense because the full model leaves very little 
variation unexplained (SSE quite close to 0), whereas the reduced model has a rather 
substantial amount of unexplained variation relative to SST. 

The t ratios of Figure 13.17 suggest that perhaps only the three quadratic predictors 
are useful and that the three interaction predictors can be eliminated. So let's now 
consider testing $H_0\colon \beta_7 = \beta_8 = \beta_9 = 0$ against the alternative that at least one of these 
three $\beta$'s is not 0. Again the sequential sums of squares information in Figure 13.17 
allows us to obtain SSE for the reduced model (containing just the six predictors 
$x_1, x_2, x_3, x_1^2, x_2^2, x_3^2$) without actually fitting that model: SSR = the sum of the first six 
numbers in the Seq SS column = 16.2930, whence SSE = 16.7982 − 16.2930 = .5052. 
Then, with k − l = 3 numerator df and the full-model error df n − (k + 1) = 7 in the denominator, 


$f = \frac{(.5052 - .2152)/3}{.2152/[17 - (9 + 1)]} = \frac{.09667}{.03074} = 3.14$ 


Table A.9 gives $F_{.10,3,7} = 3.07$ and $F_{.05,3,7} = 4.35$, implying that the P-value is 
between .05 and .10. In particular, at a significance level of .05 or .01, the null hypothesis 
would not be rejected. The conclusion at those levels is that none of the three interaction 
predictors provides additional useful information. ■ 
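Both group tests in this example can be computed directly from formula (13.20) and the sequential sums of squares. The Python sketch below does this, using k − l numerator df and the full-model error df n − (k + 1) = 7 in the denominator for each test, exactly as (13.20) prescribes. The function names are hypothetical, and the sequential SS for x1 is taken as 1.2561, the value used in the text's arithmetic.

```python
def partial_f(sse_reduced, sse_full, n, k, l):
    """F statistic (13.20) for testing whether the last k - l predictors can be dropped."""
    numerator = (sse_reduced - sse_full) / (k - l)
    denominator = sse_full / (n - (k + 1))   # MSE of the full model
    return numerator / denominator


SST, SSE_FULL, N, K = 16.7982, 0.2152, 17, 9

# Sequential sums of squares in the order the predictors were entered:
# x1, x2, x3, x1sqd, x2sqd, x3sqd, x1x2, x1x3, x2x3
SEQ_SS = [1.2561, 3.4585, 0.6555, 2.5869, 5.5393, 2.7967, 0.0306, 0.0144, 0.2450]


def sse_for_first(l):
    """SSE of the model containing only the first l predictors in the entry order."""
    return SST - sum(SEQ_SS[:l])


# Drop all six second-order predictors (reduced model: x1, x2, x3)
f_second_order = partial_f(sse_for_first(3), SSE_FULL, N, K, 3)

# Drop only the three interaction predictors (reduced model: first six predictors)
f_interactions = partial_f(sse_for_first(6), SSE_FULL, N, K, 6)

print(round(f_second_order, 1), round(f_interactions, 2))
```

Note that this shortcut works only because the predictors being tested were entered last; sequential sums of squares depend on the order of entry, so testing a different subset would require refitting or reordering.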


Assessing Model Adequacy 


The standardized residuals in multiple regression result from dividing each residual 
by its estimated standard deviation; the formula for these standard deviations is 
substantially more complicated than in the case of simple linear regression. We 
recommend a normal probability plot of the standardized residuals as a basis for 
validating the normality assumption. If the pattern in this plot departs substantially 
from linearity, the t and F procedures developed in this section should not be used to 
make inferences. Plots of the standardized residuals versus each predictor and versus 
$\hat{y}$ should show no discernible pattern. Adjusted residual plots can also be helpful in 
this endeavor. The book by Neter et al. is an extremely useful reference. 
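To make the idea of a standardized residual concrete, the Python sketch below computes them for the microhardness model of Example 13.15. It relies on a standard result that the text does not state: the estimated standard deviation of the ith residual is $s\sqrt{1 - h_{ii}}$, where $h_{ii} = x_i'(X'X)^{-1}x_i$ is the leverage of observation i. The helper `inverse` and all variable names are hypothetical, and only the standard library is used.

```python
from math import sqrt

# (N, F, microhardness) for the 18 glasses of Example 13.15
DATA = [
    (0, 0, 6.1), (20, 0, 7.5), (0, 1, 6.2), (20, 1, 7.6), (40, 1, 8.6), (0, 5, 6.1),
    (5, 5, 6.4), (10, 5, 6.7), (15, 5, 6.9), (20, 5, 7.2), (0, 0, 6.1), (0, 1, 6.2),
    (0, 3, 6.1), (0, 5, 6.1), (20, 0, 7.5), (20, 1, 7.6), (20, 3, 7.2), (20, 5, 7.2),
]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        d = m[col][col]
        m[col] = [v / d for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


X = [[1.0, n_pct, f_pct] for n_pct, f_pct, _ in DATA]   # design matrix with intercept
y = [row[2] for row in DATA]
p = len(X[0])                                           # k + 1 coefficients

xtx_inv = inverse([[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)])
xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
coef = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]

fitted = [sum(b * xi for b, xi in zip(coef, x)) for x in X]
resid = [yi - fi for yi, fi in zip(y, fitted)]
s = sqrt(sum(e * e for e in resid) / (len(y) - p))      # root mean square error

# Leverage h_ii = x_i' (X'X)^{-1} x_i; estimated SD of residual i is s * sqrt(1 - h_ii)
leverage = [sum(x[i] * xtx_inv[i][j] * x[j] for i in range(p) for j in range(p)) for x in X]
std_resid = [e / (s * sqrt(1 - h)) for e, h in zip(resid, leverage)]

print([round(r, 2) for r in std_resid])
```

Sorting `std_resid` and plotting the values against normal quantiles gives the normal probability plot recommended above; plotting them against each predictor and against `fitted` gives the other diagnostic plots.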


EXAMPLE 13.17 Figure 13.18 shows a normal probability plot of the standardized residuals for the 
microhardness data and fitted model given in Example 13.15. The straightness of the plot 
casts little doubt on the assumption that the random deviation $\epsilon$ is normally distributed. 


[Plot: vertical axis Percent; horizontal axis Standardized residual] 


Figure 13.18 A normal probability plot of the standardized residuals 
for the data and model of Example 13.15 


Figure 13.19 shows the other suggested plots for the microhardness data (fewer than 
18 points appear because various observed and calculated values are duplicated). 
Given the rather small sample size, there is not much evidence of a pattern in any 
of the first three plots other than randomness. 


Figure 13.19 Diagnostic plots for the microhardness data: (a) standardized residual versus $x_1$; 
(b) standardized residual versus $x_2$; (c) standardized residual versus $\hat{y}$; (d) $\hat{y}$ versus y ■ 

EXERCISES Section 13.4 (36-54) 

36. Cardiorespiratory fitness is widely recognized as a major component of overall physical 
well-being. Direct measurement of maximal oxygen uptake (VO₂max) is the single best measure 
of such fitness, but direct measurement is time-consuming and expensive. It is therefore 
desirable to have a prediction equation for VO₂max in terms of easily obtained quantities. 
Consider the variables 

y = VO₂max (L/min) 
$x_1$ = weight (kg) 
$x_2$ = age (yr) 
$x_3$ = time necessary to walk 1 mile (min) 
$x_4$ = heart rate at the end of the walk (beats/min) 

Here is one possible model, for male students, consistent with the information given in the 
article “Validation of the Rockport Fitness Walking Test in College Males and Females” 
(Research Quarterly for Exercise and Sport, 1994: 152-158): 

$Y = 5.0 + .01x_1 - .05x_2 - .13x_3 - .01x_4 + \epsilon$, $\sigma = .4$ 

a. Interpret $\beta_1$ and $\beta_3$. 
b. What is the expected value of VO₂max when weight is 76 kg, age is 20 yr, walk time is 
12 min, and heart rate is 140 b/m? 
c. What is the probability that VO₂max will be between 1.00 and 2.60 for a single observation 
made when the values of the predictors are as stated in part (b)? 

37. A trucking company considered a multiple regression model for relating the dependent 
variable y = total daily travel time for one of its drivers (hours) to the predictors 
$x_1$ = distance traveled (miles) and $x_2$ = the number of deliveries made. Suppose that the 
model equation is 

$Y = -.800 + .060x_1 + .900x_2 + \epsilon$ 

a. What is the mean value of travel time when distance traveled is 50 miles and three 
deliveries are made? 
b. How would you interpret $\beta_1 = .060$, the coefficient of the predictor $x_1$? What is the 
interpretation of $\beta_2 = .900$? 
c. If $\sigma = .5$ hour, what is the probability that travel time will be at most 6 hours when 
three deliveries are made and the distance traveled is 50 miles? 

38. Let y = wear life of a bearing, $x_1$ = oil viscosity, and $x_2$ = load. Suppose that the 
multiple regression model relating life to viscosity and load is 

$Y = 125.0 + 7.75x_1 + .0950x_2 - .0090x_1x_2 + \epsilon$ 

a. What is the mean value of life when viscosity is 40 and load is 1100? 
b. When viscosity is 30, what is the change in mean life associated with an increase of 1 in 
load? When viscosity is 40, what is the change in mean life associated with an increase of 1 
in load? 

39. Let y = sales at a fast-food outlet (1000s of $), $x_1$ = number of competing outlets 
within a 1-mile radius, $x_2$ = population within a 1-mile radius (1000s of people), and $x_3$ 
be an indicator variable that equals 1 if the outlet has a drive-up window and 0 otherwise. 
Suppose that the true regression model is 

$Y = 10.00 - 1.2x_1 + 6.8x_2 + 15.3x_3 + \epsilon$ 

a. What is the mean value of sales when the number of competing outlets is 2, there are 8000 
people within a 1-mile radius, and the outlet has a drive-up window? 
b. What is the mean value of sales for an outlet without a drive-up window that has three 
competing outlets and 5000 people within a 1-mile radius? 
c. Interpret $\beta_3$. 

40. The article cited in Exercise 49 of Chapter 7 gave summary information on a regression in 
which the dependent variable was power output (W) in a simulated 200-m race and the predictors 
were $x_1$ = arm girth (cm), $x_2$ = excess post-exercise oxygen consumption (ml/kg), and 
$x_3$ = immediate posttest lactate (mmol/L). The estimated regression equation was reported as 

$y = -408.20 + 14.06x_1 + .76x_2 - 3.64x_3$ (n = 11, $R^2 = .91$) 

a. Carry out the model utility test using a significance level of .01. [Note: All three 
predictors were judged to be important.] 
b. Interpret the estimate 14.06. 
c. Predict power output when arm girth is 36 cm, excess oxygen consumption is 120 ml/kg, and 
lactate is 10.0. 
d. Calculate a point estimate for true average power output when values of the predictors are 
as given in (c). 
e. Obtain a point estimate for the true average change in power output associated with a 
1 mmol/L increase in lactate while arm girth and oxygen consumption remain fixed. 

41. The article “A Study of Factors Affecting the Human Cone Photoreceptor Density Measured by 
Adaptive Optics Scanning Laser Ophthalmoscope” (Exptl. Eye Research, 2013: 1-9) included a 
summary of a multiple 


42. 
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regression analysis based on a sample of n = 192 eyes; 

the dependent variable was cone cell packing density 

(cells/mm), and the two independent variables were x, = 

eccentricity (mm) and x, = axial length (mm). 

a. The reported coefficient of multiple determination 
was .834. Interpret this value, and carry out a test of 
model utility. 

b. The estimated regression function was y = 
35,821.792 — 6294.729x, — 348.037x,. Calculate a 
point prediction for packing density when eccentricity is 
1 mm and axial length is 25 mm. 

c. Interpret the coefficient on x, in the estimated regres- 
sion function in (b). 

d. The estimated standard error of B, was 203.702. 
Calculate and interpret a confidence interval with 
confidence level 95% for B,. 

e. The estimated standard error of the estimated coeffi- 
cient on axial length was 134.350. Test the null hypoth- 
esis Hy: B, = 0 against the alternative H,: 8, ~ 0 using 
a significance level of .05, and interpret the result. 


An investigation of a die-casting process resulted in the 
accompanying data on x, = furnace temperature, x, = 
die close time, and y = temperature difference on the die 
surface (“A Multiple-Objective Decision-Making 
Approach for Assessing Simultaneous Improvement in 
Die Life and Casting Quality in a Die Casting Process,” 
Quality Engineering, 1994: 371-383). 


$x_1$ | 1250  1300  1350  1250  1300
$x_2$ |    6     7     6     7     6
y     |   80    95   101    85    92

$x_1$ | 1250  1300  1350  1350
$x_2$ |    8     8     7     8
y     |   87    96   106   108


Minitab output from fitting the multiple regression 
model with predictors x, and x, is given here. 


The regression equation is
tempdiff = -200 + 0.210 furntemp + 3.00 clostime


Predictor       Coef      Stdev   t-ratio       p
Constant     -199.56      11.64    -17.14   0.000
furntemp    0.210000   0.008642     24.30   0.000
clostime      3.0000     0.4321      6.94   0.000


s = 1.058   R-sq = 99.1%   R-sq(adj) = 98.8%


Analysis of Variance


SOURCE       DF       SS       MS        F       p
Regression    2   715.50   357.75   319.31   0.000
Error         6     6.72     1.12
Total         8   722.22


a. Carry out the model utility test. 
b. Calculate and interpret a 95% confidence interval for
$\beta_2$, the population regression coefficient of $x_2$.

c. When $x_1$ = 1300 and $x_2$ = 7, the estimated standard
deviation of $\hat{Y}$ is $s_{\hat{Y}}$ = .353. Calculate a 95%
confidence interval for true average temperature difference
when furnace temperature is 1300 and die
close time is 7.

d. Calculate a 95% prediction interval for the temperature
difference resulting from a single experimental
run with a furnace temperature of 1300 and a die
close time of 7.


43. An experiment carried out to study the effect of the mole
contents of cobalt ($x_1$) and the calcination temperature
($x_2$) on the surface area of an iron-cobalt hydroxide catalyst
(y) resulted in the accompanying data (“Structural
Changes and Surface Properties of Co$_x$Fe$_{3-x}$O$_4$
Spinels,” J. of Chemical Tech. and Biotech., 1994:
161-170). A request to the SAS package to fit
$\beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3$, where $x_3 = x_1x_2$ (an interaction
predictor), yielded the output below.


$x_1$ |    .6     .6     .6     .6     .6    1.0    1.0
$x_2$ |   200    250    400    500    600    200    250
y     |  90.6   82.7   58.7   43.2   25.0  127.1  112.3

$x_1$ |   1.0    1.0    1.0    2.6    2.6    2.6    2.6
$x_2$ |   400    500    600    200    250    400    500
y     |  19.6   17.8    9.1   53.1   52.0   43.4   42.4

$x_1$ |   2.6    2.8    2.8    2.8    2.8    2.8
$x_2$ |   600    200    250    400    500    600
y     |  31.6   40.9   37.9   27.5   27.3   19.0


a. Predict the value of surface area when cobalt content
is 2.6 and temperature is 250, and calculate the value
of the corresponding residual.

b. Since the estimated coefficient on cobalt content is
about -46, is it legitimate to conclude that if cobalt
content increases by 1 unit while the values of
the other predictors remain fixed, surface area can be
expected to decrease by roughly 46 units? Explain
your reasoning.

c. Does there appear to be a useful linear relationship
between y and the predictors?

d. Given that mole contents and calcination temperature
remain in the model, does the interaction predictor
$x_3$ provide useful information about y? State and
test the appropriate hypotheses using a significance
level of .01.

e. The estimated standard deviation of $\hat{Y}$ when mole
contents is 2.0 and calcination temperature is 500 is
$s_{\hat{Y}}$ = 4.69. Calculate a 95% confidence interval for
the mean value of surface area under these
circumstances.


SAS output for Exercise 43

Dependent Variable: SURFAREA

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model       3      15223.52829    5074.50943    18.924   0.0001
Error      16       4290.53971     268.15873
C Total    19      19514.06800

Root MSE   16.37555    R-square   0.7801
Dep Mean   48.06000    Adj R-sq   0.7389
C.V.       34.07314

Parameter Estimates

                 Parameter      Standard       T for H0:
Variable   DF     Estimate         Error   Parameter = 0   Prob > |T|
INTERCEP    1   185.485740   21.19747682           8.750       0.0001
COBCON      1   -45.969466   10.61201173          -4.332       0.0005
TEMP        1    -0.301503    0.05074421          -5.942       0.0001
CONTEMP     1     0.088801    0.02540388           3.496       0.0030


44. The accompanying Minitab regression output is based on
data that appeared in the article “Application of Design
of Experiments for Modeling Surface Roughness in
Ultrasonic Vibration Turning” (J. of Engr. Manuf.,
2009: 641-652). The response variable is surface roughness
(μm), and the independent variables are vibration
amplitude (μm), depth of cut (mm), feed rate (mm/rev),
and cutting speed (m/min), respectively.

a. How many observations were there in the data set?

b. Interpret the coefficient of multiple determination.

c. Carry out a test of hypotheses to decide if the model
specifies a useful relationship between the response
variable and at least one of the predictors.

d. Interpret the number 18.2602 that appears in the
Coef column.

e. At significance level .10, can any single one of the
predictors be eliminated from the model provided
that all of the other predictors are retained?

f. The estimated SD of $\hat{Y}$ when the values of the four
predictors are 10, .5, .25, and 50, respectively, is
.1178. Calculate both a CI for true average roughness
and a PI for the roughness of a single specimen, and
compare these two intervals.


The regression equation is
Ra = -0.972 - 0.0312 a + 0.557 d + 18.3 f + 0.00282 v


Predictor       Coef    SE Coef       T       P
Constant     -0.9723     0.3923   -2.48   0.015
a           -0.03117    0.01864   -1.67   0.099
d             0.5568     0.3185    1.75   0.084
f            18.2602     0.7536   24.23   0.000
v           0.002822   0.003977    0.71   0.480


S = 0.822059   R-Sq = 88.6%   R-Sq(adj) = 88.0%


Source           DF       SS       MS        F       P
Regression        4   401.02   100.25   148.35   0.000
Residual Error   76    51.36     0.68
Total            80   452.38


45. The article “Analysis of the Modeling Methodologies 
for Predicting the Strength of Air-Jet Spun Yarns” 
(Textile Res. J., 1997: 39-44) reported on a study carried 
out to relate yarn tenacity (y, in g/tex) to yarn count ($x_1$,
in tex), percentage polyester ($x_2$), first nozzle pressure
($x_3$, in kg/cm$^2$), and second nozzle pressure ($x_4$, in
kg/cm$^2$). The estimate of the constant term in the corresponding
multiple regression equation was 6.121. The
estimated coefficients for the four predictors were -.082,
.113, .256, and -.219, respectively, and the coefficient of
multiple determination was .946.

a. Assuming that the sample size was n = 25, state and 
test the appropriate hypotheses to decide whether the 
fitted model specifies a useful linear relationship 
between the dependent variable and at least one of 
the four model predictors. 

b. Again using n = 25, calculate the value of adjusted $R^2$.

c. Calculate a 99% confidence interval for true mean yarn
tenacity when yarn count is 16.5, yarn contains 50% 
polyester, first nozzle pressure is 3, and second nozzle 
pressure is 5 if the estimated standard deviation of 
predicted tenacity under these circumstances is .350. 


46. A regression analysis carried out to relate y = repair time 
for a water filtration system (hr) to $x_1$ = elapsed time
since the previous service (months) and $x_2$ = type of
repair (1 if electrical and 0 if mechanical) yielded the following
model based on n = 12 observations:
$y = .950 + .400x_1 + 1.250x_2$. In addition, SST = 12.72,
SSE = 2.09, and 5g.= dQ: 

a. Does there appear to be a useful linear relationship 
between repair time and the two model predictors? 
Carry out a test of the appropriate hypotheses using 
a significance level of .05. 

b. Given that elapsed time since the last service remains in 
the model, does type of repair provide useful 
information about repair time? State and test the appro- 
priate hypotheses using a significance level of .01. 

c. Calculate and interpret a 95% CI for $\beta_2$.

d. The estimated standard deviation of a prediction for
repair time when elapsed time is 6 months and the
repair is electrical is .192. Predict repair time under
these circumstances by calculating a 99% prediction
interval. Does the interval suggest that the estimated
model will give an accurate prediction? Why or why
not?


47. Efficient design of certain types of municipal waste
incinerators requires that information about energy 
content of the waste be available. The authors of the 
article “Modeling the Energy Content of Municipal 
Solid Waste Using Multiple Regression Analysis” (J. 
of the Air and Waste Mgmnt. Assoc., 1996: 650-656) 
kindly provided us with the accompanying data on 
y = energy content (kcal/kg), the three physical composition
variables $x_1$ = % plastics by weight, $x_2$ = %
paper by weight, and $x_3$ = % garbage by weight, and
the proximate analysis variable $x_4$ = % moisture by
weight for waste specimens obtained from a certain
region.


                                              Energy
Obs   Plastics   Paper   Garbage   Water     Content

  1     18.69    15.65    45.01    58.21        947
  2     19.43    23.51    39.69    46.31       1407
  3     19.24    24.23    43.16    46.63       1452
  4     22.64    22.20    35.76    45.85       1553
  5     16.54    23.56    41.20    55.14        989
  6     21.44    23.65    35.56    54.24       1162
  7     19.53    24.45    40.18    47.20       1466
  8     23.97    19.39    44.11    43.82       1656
  9     21.45    23.84    35.41    51.01       1254
 10     20.34    26.50    34.21    49.06       1336
 11     17.03    23.46    32.45    53.23       1097
 12     21.03    26.99    38.19    51.78       1266
 13     20.49    19.87    41.35    46.69       1401
 14     20.45    23.03    43.59    53.57       1223
 15     18.81    22.62    42.20    52.98       1216
 16     18.28    21.87    41.50    47.44       1334
 17     21.41    20.47    41.20    54.68       1155
 18     25.11    22.59    37.02    48.74       1453
 19     21.04    26.27    38.66    53.22       1278
 20     17.99    28.22    44.18    53.37       1153
 21     18.73    29.39    34.77    51.06       1225
 22     18.49    26.58    37.55    50.66       1237
 23     22.08    24.88    37.07    50.72       1327
 24     14.28    26.27    35.80    48.24       1229
 25     17.74    23.61    37.36    49.92       1205
 26     20.54    26.58    35.40    53.58       1221
 27     18.25    13.77    51.32    51.38       1138
 28     19.09    25.62    39.54    50.13       1295
 29     21.25    20.63    40.72    48.67       1391
 30     21.62    22.71    36.22    48.19       1372


Using Minitab to fit a multiple regression model with
the four aforementioned variables as predictors of energy
content resulted in the following output:

The regression equation is
enercont = 2245 + 28.9 plastics + 7.64 paper + 4.30 garbage - 37.4 water


Predictor      Coef    StDev        T       P
Constant     2244.9    177.9    12.62   0.000
plastics     28.925    2.824    10.24   0.000
paper         7.644    2.314     3.30   0.003
garbage       4.297    1.916     2.24   0.034
water       -37.354    1.834   -20.36   0.000

s = 31.48   R-Sq = 96.4%   R-Sq(adj) = 95.8%


Analysis of Variance


Source       DF       SS       MS        F       P
Regression    4   664931   166233   167.71   0.000
Error        25    24779      991
Total        29   689710


a. Interpret the values of the estimated regression
coefficients $\hat{\beta}_1$ and $\hat{\beta}_4$.

b. State and test the appropriate hypotheses to decide
whether the model fit to the data specifies a useful
linear relationship between energy content and at
least one of the four predictors.

c. Given that % plastics, % paper, and % water remain
in the model, does % garbage provide useful information
about energy content? State and test the appropriate
hypotheses using a significance level of .05.

d. Use the fact that $s_{\hat{Y}}$ = 7.46 when $x_1$ = 20, $x_2$ = 25,
$x_3$ = 40, and $x_4$ = 45 to calculate a 95% confidence
interval for true average energy content under these
circumstances. Does the resulting interval suggest that
mean energy content has been precisely estimated?

e. Use the information given in part (d) to predict
energy content for a waste sample having the specified
characteristics, in a way that conveys information
about precision and reliability.


48. An experiment to investigate the effects of a new technique
for degumming of silk yarn was described in the article 
“Some Studies in Degumming of Silk with Organic 
Acids” (J. Society of Dyers and Colourists, 1992: 79-86). 
One response variable of interest was y = weight loss (%). 
The experimenters made observations on weight loss for 
various values of three independent variables: $x_1$ = temperature
(°C) = 90, 100, 110; $x_2$ = time of treatment
(min) = 30, 75, 120; $x_3$ = tartaric acid concentration
(g/L) = 0, 8, 16. In the regression analyses, the three values
of each variable were coded as -1, 0, and 1, respectively,
giving the accompanying data (the value $y_8$ = 19.3
was reported, but our value $y_8$ = 20.3 results in regression
output identical to that appearing in the article).


Obs   |    1     2     3     4     5     6     7     8
$x_1$ |   -1    -1     1     1    -1    -1     1     1
$x_2$ |   -1     1    -1     1     0     0     0     0
$x_3$ |    0     0     0     0    -1     1    -1     1
y     | 18.3  22.2  23.0  23.0   3.3  19.3  19.3  20.3

Obs   |    9    10    11    12    13    14    15
$x_1$ |    0     0     0     0     0     0     0
$x_2$ |   -1    -1     1     1     0     0     0
$x_3$ |   -1     1    -1     1     0     0     0
y     | 13.1  23.0  20.9  21.5  22.0  21.3  22.6


A multiple regression model with k = 9 predictors ($x_1$, $x_2$, $x_3$,
$x_4 = x_1^2$, $x_5 = x_2^2$, $x_6 = x_3^2$, $x_7 = x_1x_2$, $x_8 = x_1x_3$,
and $x_9 = x_2x_3$) was fit to the data, resulting in
$\hat{\beta}_0 = 21.967$, $\hat{\beta}_1 = 2.8125$, $\hat{\beta}_2 = 1.2750$, $\hat{\beta}_3 = 3.4375$,
$\hat{\beta}_4 = -2.208$, $\hat{\beta}_5 = 1.867$, $\hat{\beta}_6 = -4.208$, $\hat{\beta}_7 = -.975$,
$\hat{\beta}_8 = -3.750$, $\hat{\beta}_9 = -2.325$, SSE = 23.379, and
$R^2$ = .938.

a. Does this model specify a useful relationship? State and 
test the appropriate hypotheses using a significance 
level of .01. 

b. The estimated standard deviation of $\hat{Y}$ when
$x_1 = \cdots = x_9 = 0$ (i.e., when temperature = 100,
time = 75, and concentration = 8) is 1.248. Calculate
a 95% CI for expected weight loss when temperature,
time, and concentration have the specified values.

c. Calculate a 95% PI for a single weight-loss value to 
be observed when temperature, time, and concentra- 
tion have values 100, 75, and 8, respectively. 

d. Fitting the model with only $x_1$, $x_2$, and $x_3$ as predictors
gave $R^2$ = .456 and SSE = 203.82. Does at least one
of the second-order predictors provide additional useful 
information? State and test the appropriate hypotheses. 




49. Researchers carried out a study to see how y = ultimate
deflection (mm) of reinforced ultrahigh toughness
cementitious composite beams was influenced by $x_1$ =
shear span ratio and $x_2$ = splitting tensile strength (MPa),
resulting in the accompanying data (“Shear Behavior of
Reinforced Ultrahigh Toughness Cementitious
Composite Beams without Transverse Reinforcement,”
J. of Materials in Civil Engr., 2012: 1283-1294):

$x_1$   $x_2$      y      $x_1$   $x_2$      y

2.04    3.55    3.11     3.08    3.62    3.36
2.04    6.07    3.26     3.08    5.89    6.49
3.06    3.55    3.89     4.11    3.62    2.72
3.06    6.07   10.25     4.11    5.89   12.48
4.08    3.55    3.11     2.01    6.18    2.82
4.08    6.16   13.48     3.02    6.18    5.19
2.06    3.62    3.94     4.03    6.18    8.04
2.06    6.16    3.53

a. Here is Minitab output from fitting the model with
predictors $x_1$, $x_2$, and $x_3 = x_1x_2$:

The regression equation is
y = 17.3 - 6.37 x1 - 3.66 x2 + 1.71 x1x2

Predictor      Coef   SE Coef       T       P
Constant     17.279     7.167    2.41   0.035
x1           -6.368     2.260   -2.82   0.017
x2           -3.658     1.364   -2.68   0.021
x1x2         1.7067    0.4314    3.96   0.002


S = 1.72225   R-Sq = 82.5%   R-Sq(adj) = 77.8%


Analysis of Variance


Source           DF        SS       MS       F       P
Regression        3   154.033   51.344   17.31   0.000
Residual Error   11    32.627    2.966
Total            14   186.660


Carry out a test of model utility.


b. Should the interaction predictor be retained in the 
model? Carry out a test of hypotheses using a sig- 
nificance level of .05. 

c. The estimated standard deviation of $\hat{Y}$ when $x_1$ = 3
and $x_2$ = 6 is $s_{\hat{Y}}$ = .555. Calculate and interpret a
confidence interval with a 95% confidence level for
true average deflection under these circumstances.

d. Using the information in (c), calculate and interpret
a prediction interval using a 95% confidence level
for a future value of ultimate deflection to be
observed when $x_1$ = 3 and $x_2$ = 6.


50. When the model $Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_1^2 +
\beta_4x_2^2 + \beta_5x_1x_2 + \epsilon$ is fit to the data of Exercise 49, the
resulting value of SSE is 28.947. Given that the predictors
$x_1$, $x_2$, and $x_1x_2$ remain in the model, does either of
the quadratic predictors $x_1^2$ or $x_2^2$ provide additional
useful information? State and test the appropriate
hypotheses.


51. The article “Optimization of Surface Roughness in
Drilling Using Vegetable-Based Cutting Oils Developed
from Sunflower Oil” (Industrial Lubrication and
Tribology, 2011: 271-276) gave the following data on $x_1$
= spindle speed (rpm), $x_2$ = feed rate (mm/rev), $x_3$ =
drilling depth (mm), and y = surface roughness (μm)
when a semisynthetic cutting fluid was used:


$x_1$   $x_2$   $x_3$      y      e*
320     .10     15    2.27   -1.32
320     .12     20    4.14    1.08
320     .14     25    4.69    0.26
420     .10     20    1.92   -0.40
420     .12     25    2.63   -0.79
420     .14     15    4.34    0.99
520     .10     25    2.03    1.64
520     .12     15    2.34    0.03
520     .14     20    2.67   -1.52

a. Here is partial Minitab output from fitting the model
with $x_1$, $x_2$, and $x_3$ as predictors (authors of the cited
article used Minitab for this purpose):


Predictor        Coef    SE Coef       T       P
Constant        0.099      1.871    0.05   0.960
x1          -0.006767   0.002231   -3.03   0.029
x2              45.67      11.16    4.09   0.009
x3            0.01333    0.04463    0.30   0.777


S = 0.546589   R-Sq = 83.9%   R-Sq(adj) = 74.2%

Does drilling depth provide useful information about
roughness given that spindle speed and feed rate
remain in the model?

b. Here is Minitab output from fitting the model with
just $x_1$ and $x_2$ as predictors (the cited article made no
mention of this model):


Predictor        Coef    SE Coef       T       P
Constant        0.365      1.514    0.24   0.817
x1          -0.006767   0.002055   -3.29   0.017
x2              45.67      10.28    4.44   0.004


S = 0.503400   R-Sq = 83.6%   R-Sq(adj) = 78.1%


Carry out a test of model utility using α = .05.

c. Calculate and interpret a 95% CI for the population 
regression coefficient on x). 

d. The estimated standard deviation of the predicted Y
when $x_1$ = 400 and $x_2$ = .125 is .180. Calculate a
95% CI for true average roughness under these
circumstances.

e. The e* values that appear along with the data are from 
the regression of (b). Investigate model adequacy. 


52. Utilization of sucrose as a carbon source for the production
of chemicals is uneconomical. Beet molasses is a
readily available and low-priced substitute. The article
“Optimization of the Production of β-Carotene from
Molasses by Blakeslea Trispora” (J. of Chem. Tech. and
Biotech., 2002: 933-943) carried out a multiple regression
analysis to relate the dependent variable y = amount of
β-carotene (g/dm$^3$) to the three predictors amount of linoleic
acid, amount of kerosene, and amount of antioxidant
(all g/dm$^3$).

Obs Linoleic Kerosene Antiox Betacaro 
1 30.00 30.00 10.00 0.7000 
2 30.00 30.00 10.00 0.6300 
3 30.00 30.00 18.41 0.0130 
4 40.00 40.00 5.00 0.0490 
5 30.00 30.00 10.00 0.7000 
6 13.18 30.00 10.00 0.1000 
7 20.00 40.00 5.00 0.0400 
8 20.00 40.00 15.00 0.0065 
9 40.00 20.00 5.00 0.2020 

10 30.00 30.00 10.00 0.6300 
11 30.00 30.00 1.59 0.0400 
12 40.00 20.00 15.00 0.1320 
13 40.00 40.00 15.00 0.1500 
14 30.00 30.00 10.00 0.7000 
15 30.00 46.82 10.00 0.3460 
16 30.00 30.00 10.00 0.6300 
17 30.00 13.18 10.00 0.3970 
18 20.00 20.00 5.00 0.2690 
19 20.00 20.00 15.00 0.0054 
20 46.82 30.00 10.00 0.0640

a. Fitting the complete second-order model in the three
predictors resulted in $R^2$ = .987 and adjusted $R^2$ =
.974, whereas fitting the first-order model gave $R^2$ =
.016. What would you conclude about the two models?

b. For $x_1 = x_2 = 30$, $x_3 = 10$, a statistical software
package reported that $\hat{y}$ = .66573, $s_{\hat{Y}}$ = .01785,
based on the complete second-order model. Predict
the amount of β-carotene that would result from a
single experimental run with the designated values of
the independent variables, and do so in a way that
conveys information about precision and reliability.

53. Snowpacks contain a wide spectrum of pollutants that 
may represent environmental hazards. The article 
“Atmospheric PAH Deposition: Deposition Velocities 
and Washout Ratios” (J. of Environmental 
Engineering, 2002: 186-195) focused on the deposition
of polyaromatic hydrocarbons. The authors proposed
a multiple regression model for relating deposition
over a specified time period (y, in μg/m$^2$) to two
rather complicated predictors $x_1$ (μg-sec/m$^3$) and $x_2$
(μg/m$^2$), defined in terms of PAH air concentrations for
various species, total time, and total amount of precipitation.
Here is data on the species fluoranthene and
corresponding Minitab output:

obs       x1          x2     flth
  1    92017    .0026900   278.78
  2    51830    .0030000   124.53
  3    17236    .0000196    22.65
  4    15776    .0000360    28.68
  5    33462    .0004960    32.66
  6   243500    .0038900   604.70
  7    67793    .0011200    27.69
  8    23471    .0006400    14.18
  9    13948    .0004850    20.64
 10     8824    .0003660    20.60
 11     7699    .0002290    16.61
 12    15791    .0014100    15.08
 13    10239    .0004100    18.05
 14    43835    .0000960    99.71
 15    49793    .0000896    58.97
 16    40656    .0026000   172.58
 17    50774    .0009530    44.25

The regression equation is
flth = -33.5 + 0.00205 x1 + 29836 x2

Predictor        Coef     SE Coef       T       P
Constant       -33.46       14.90   -2.25   0.041
x1          0.0020548   0.0002945    6.98   0.000
x2              29836       13654    2.19   0.046

S = 44.28   R-Sq = 92.3%   R-Sq(adj) = 91.2%

Analysis of Variance

Source           DF       SS       MS       F       P
Regression        2   330989   165495   84.39   0.000
Residual Error   14    27454     1961
Total            16   358443


Formulate questions and perform appropriate analyses to
draw conclusions.


54. The use of high-strength steels (HSS) rather than aluminum
and magnesium alloys in automotive body structures
reduces vehicle weight. However, HSS use is still
problematic because of difficulties with limited formability,
increased springback, difficulties in joining, and
reduced die life. The article “Experimental Investigation
of Springback Variation in Forming of High Strength
Steels” (J. of Manuf. Sci. and Engr., 2008: 1-9)
included data on y = springback from the wall opening
angle and $x_1$ = blank holder pressure. Three different
material suppliers and three different lubrication regimens
(no lubrication, lubricant #1, and lubricant #2)
were also utilized.

a. What predictors would you use in a model to incorporate
supplier and lubrication information in addition
to BHP?

b. The accompanying Minitab output resulted from fitting
the model of (a) (the article’s authors also used
Minitab; amusingly, they employed a significance
level of .06 in various tests of hypotheses). Does
there appear to be a useful relationship between the
response variable and at least one of the predictors?
Carry out a formal test of hypotheses.

c. When BHP is 1000, material is from supplier 1,
and no lubrication is used, $s_{\hat{Y}}$ = .524. Calculate a
95% PI for the springback that would result from
making an additional observation under these conditions.

d. From the output, it appears that lubrication regimen
may not be providing useful information. A regression
with the corresponding predictors removed
resulted in SSE = 48.426. What is the coefficient of
multiple determination for this model, and what
would you conclude about the importance of the
lubrication regimen?

e. A model with predictors for BHP, supplier, and lubrication
regimen, as well as predictors for interactions
between BHP and both supplier and lubrication regimen,
resulted in SSE = 28.216 and $R^2$ = .849. Does
this model appear to improve on the model with just
BHP and predictors for supplier?


Predictor         Coef     SE Coef       T       P
Constant       21.5322      0.6782   31.75
BHP         -0.0033680   0.0003919
Suppl_1        -1.7181      0.5977
Suppl_2        -1.4840      0.6010
Lub_1          -0.3036      0.5754
Lub_2           0.8931      0.5779    1.55


S = 1.18413   R-Sq = 77.5%   R-Sq(adj) = 73.8%


Source           DF        SS       MS       F       P
Regression        5   144.915   28.983   20.67   0.000
Residual Error   30    42.065    1.402
Total            35   186.980


13.5 Other Issues in Multiple Regression


In this section, we touch upon a number of issues that may arise when a multiple 
regression analysis is carried out. Consult the chapter references for a more exten- 
sive treatment of any particular topic. 


Transformations 


Sometimes, theoretical considerations suggest a nonlinear relation between a 
dependent variable and two or more independent variables, whereas on other occa- 
sions diagnostic plots indicate that some type of nonlinear function should be used. 
Frequently a transformation will linearize the model. 


EXAMPLE 13.18 Natural single crystal diamond has been widely used in ultraprecision machin- 
ing. However, its application to the cutting of ferrous metals has been problematic 
due to significant tool wear. The article “Investigation on Frictional Wear of 
Single Crystal Diamond Against Ferrous Metals” (Intl. J. of Refractory Metals 
and Hard Materials, 2013: 174-179) presented the accompanying data on $x_1$ =
mechanical force (N), $x_2$ = sliding velocity (m/s), $x_3$ = carbon content (%), and y =
graphitized degree, a measure of diamond wear.


Obs   $x_1$   $x_2$   $x_3$     y
 1     10     .84    .07    .18
 2     10    1.05    .27    .19
 3     10    1.26    .45    .22
 4     20     .84    .27    .21
 5     20    1.05    .45    .24
 6     20    1.26    .07    .28
 7     30     .84    .45    .26
 8     30    1.05    .07    .30
 9     30    1.26    .27    .33


The investigators proposed and fit the multiplicative power regression model
$Y = \alpha x_1^{\beta_1} x_2^{\beta_2} x_3^{\beta_3} \epsilon$. Taking the natural logarithm of both sides of this equation gives


$$\ln(Y) = \ln(\alpha) + \beta_1 \ln(x_1) + \beta_2 \ln(x_2) + \beta_3 \ln(x_3) + \ln(\epsilon) \qquad (13.21)$$


which is our general additive multiple regression equation with the dependent variable
being the natural log of graphitized degree and predictors $\ln(x_1)$, $\ln(x_2)$, and $\ln(x_3)$.
Presuming that $\epsilon$ in the original model equation has a lognormal distribution, the
random error in our transformed model will be normally distributed. The plausibility
of this assumption can be checked with a normal probability plot of the standardized
residuals resulting from fitting the transformed model.

Table 13.4 shows Minitab output from fitting (13.21). The $R^2$ value is quite
impressive: about 98% of the observed variation in ln(y) can be attributed to
the model relationship, and adjusted $R^2$ is only slightly smaller than $R^2$ itself.
Furthermore, the P-value for the model utility F test is .000 (the area under the $F_{3,5}$
curve to the right of 81.16), implying a useful relationship between ln(y) and at least
one of the three predictors. The point estimates of $\beta_1$, $\beta_2$, and $\beta_3$ are .36557, .59366,
and -.02074, respectively. The point estimate of $\ln(\alpha)$ is -2.53727, so the point
estimate of $\alpha$ itself is $e^{-2.53727} = .079082$. The estimated original regression function
is then $.079x_1^{.366}x_2^{.594}x_3^{-.021}$; this appears in the cited article.

Table 13.4 Minitab output for the transformed regression in Example 13.18


The regression equation is
ln(y) = -2.54 + 0.366 ln(x1) + 0.594 ln(x2) - 0.0207 ln(x3)

Predictor       Coef   SE Coef        T       P
Constant    -2.53727   0.08413   -30.16   0.000
ln(x1)       0.36557   0.02734    13.37   0.000
ln(x2)       0.59366   0.07480     7.94   0.001
ln(x3)      -0.02074   0.01580    -1.31   0.246

S = 0.0372066   R-Sq = 98.0%   R-Sq(adj) = 96.8%


Analysis of Variance


Source           DF        SS        MS       F       P
Regression        3   0.33706   0.11235   81.16   0.000
Residual Error    5   0.00692   0.00138
Total             8   0.34399


Predicted Values for New Observations
New Obs       Fit   SE Fit               95% CI               95% PI
      1   -1.4134   0.0133   (-1.4477, -1.3791)   (-1.5150, -1.3118)
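
The coefficient estimates in Table 13.4 can be checked by regressing ln(y) on the logged predictors. What follows is only a minimal sketch in Python using the standard library: the helper names `solve`, `least_squares`, and `predict` are hypothetical, and the normal equations are solved directly, which is adequate for a problem this small but is not necessarily how Minitab carries out the computation.

```python
import math

# Data from Example 13.18: force (N), sliding velocity (m/s),
# carbon content (%), and graphitized degree.
x1 = [10, 10, 10, 20, 20, 20, 30, 30, 30]
x2 = [0.84, 1.05, 1.26, 0.84, 1.05, 1.26, 0.84, 1.05, 1.26]
x3 = [0.07, 0.27, 0.45, 0.27, 0.45, 0.07, 0.45, 0.07, 0.27]
y = [0.18, 0.19, 0.22, 0.21, 0.24, 0.28, 0.26, 0.30, 0.33]


def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination (hypothetical helper)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(rows, response):
    """Least squares estimates from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, response)) for i in range(p)]
    return solve(xtx, xty)


# Transformed model (13.21): ln(y) regressed on ln(x1), ln(x2), ln(x3).
rows = [[1.0, math.log(a), math.log(b), math.log(c)] for a, b, c in zip(x1, x2, x3)]
log_y = [math.log(v) for v in y]
b0, b1, b2, b3 = least_squares(rows, log_y)
alpha_hat = math.exp(b0)  # back-transformed estimate of alpha


def predict(force, velocity, carbon):
    """Point prediction of graphitized degree from the fitted power model."""
    return alpha_hat * force ** b1 * velocity ** b2 * carbon ** b3


print(round(b0, 5), round(b1, 5), round(b2, 5), round(b3, 5))
print(round(predict(20, 1, 0.25), 3))
```

Because the model is linear in the logged variables, ordinary least squares applies without modification; only the final prediction has to be carried back to the original scale.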


A point prediction of the value of graphitized degree when force = 20, velocity =
1, and carbon content = .25 requires that we first obtain a point prediction of ln(Y)
by substituting ln(20), ln(1), and ln(.25) into the estimated regression equation in
Table 13.4. The result is $\widehat{\ln(y)} = -1.4134$, which appears in the last line of Minitab
output. Then $\hat{y} = e^{-1.4134} = .243$. Similarly, the output gives a 95% PI for ln(Y), so
a PI for Y itself is $(e^{-1.5150}, e^{-1.3118}) = (.220, .269)$.
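
The back-transformation just described is simple arithmetic, sketched below in Python. The variable names are illustrative only; the numerical inputs are the fit and the 95% PI for ln(Y) copied from the last line of Table 13.4.

```python
import math

# Values on the log scale, taken from the "Predicted Values" line of Table 13.4.
fit_log = -1.4134
pi_log = (-1.5150, -1.3118)

# Exponentiating carries the point prediction and the interval endpoints
# back to the original scale of graphitized degree.
y_hat = math.exp(fit_log)
pi_y = tuple(math.exp(endpoint) for endpoint in pi_log)

print(round(y_hat, 3), tuple(round(v, 3) for v in pi_y))
```

Exponentiation is an increasing function, so the endpoints keep their order and the transformed interval retains its 95% prediction level for Y.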

The normal probability plot of Figure 13.20 exhibits a substantial linear
pattern, validating the normality assumption for $\ln(\epsilon)$. And the plot of standardized
residuals versus predicted values [of ln(y)] does not show any pattern other
than pure randomness, indicating no violation of model assumptions. However,
looking back at Table 13.4, the P-value for testing $H_0$: $\beta_3 = 0$ is .246. Thus it
appears that as long as $\ln(x_1)$ and $\ln(x_2)$ remain in the model, there is no useful
information about the response variable contained in the natural log of carbon
content. Deleting that predictor and refitting gives $R^2$ = .973 and a model utility
F ratio of 107.87. The estimates of $\beta_1$ and $\beta_2$ are almost identical to those
for the three-predictor model. Also, the multiple exponential regression model
$Y = \alpha e^{\beta_1x_1 + \beta_2x_2}\epsilon$ [for which ln(Y) is regressed against $x_1$ and $x_2$ rather than against
$\ln(x_1)$ and $\ln(x_2)$] fits the data about as well as does the power model. None of this
was mentioned in the cited article.

Figure 13.20 Standardized residual plot and normal probability plot for Example 13.18


The logistic regression model was introduced in Section 13.2 to relate a
dichotomous variable y to a single predictor. This model can be extended in an obvious
way to incorporate more than one predictor. The probability of success p is now
a function of the predictors $x_1, x_2, \ldots, x_k$:

$$p(x_1, \ldots, x_k) = \frac{e^{\beta_0 + \beta_1x_1 + \cdots + \beta_kx_k}}{1 + e^{\beta_0 + \beta_1x_1 + \cdots + \beta_kx_k}}$$

Simple algebra yields an expression for the odds:

$$\frac{p(x_1, \ldots, x_k)}{1 - p(x_1, \ldots, x_k)} = e^{\beta_0 + \beta_1x_1 + \cdots + \beta_kx_k}$$

The interpretation of $\beta_i$ (i = 1, ..., k) is analogous to the interpretation for $\beta_1$ given
in the logit function containing only a single predictor x. That is, the following argument
shows that the odds change by the multiplicative factor $e^{\beta_i}$ when $x_i$ increases
by 1 unit and all other predictors remain fixed.

$$\frac{p(x_1, \ldots, x_i + 1, \ldots, x_k)\,/\,[1 - p(x_1, \ldots, x_i + 1, \ldots, x_k)]}{p(x_1, \ldots, x_i, \ldots, x_k)\,/\,[1 - p(x_1, \ldots, x_i, \ldots, x_k)]}
= \frac{e^{\beta_0 + \beta_1x_1 + \cdots + \beta_i(x_i + 1) + \cdots + \beta_kx_k}}{e^{\beta_0 + \beta_1x_1 + \cdots + \beta_ix_i + \cdots + \beta_kx_k}}
= e^{\beta_i}$$

Again, statistical software must be used to estimate parameters, calculate relevant
standard deviations, and provide other inferential information.
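
The multiplicative effect on the odds can be illustrated numerically. The following Python sketch evaluates the logistic function for k = 2 predictors; the coefficient values and predictor values are hypothetical, chosen only for illustration, and the function names are not taken from any particular package.

```python
import math


def success_probability(betas, xs):
    """p(x_1, ..., x_k) for coefficients betas = [b0, b1, ..., bk]."""
    linear = betas[0] + sum(b * x for b, x in zip(betas[1:], xs))
    return math.exp(linear) / (1.0 + math.exp(linear))


def odds(betas, xs):
    """Odds of success, p / (1 - p)."""
    p = success_probability(betas, xs)
    return p / (1.0 - p)


# Hypothetical coefficients (b0, b1, b2) and predictor values.
betas = [-1.0, 0.5, -0.25]
xs = [2.0, 3.0]
xs_plus_one = [3.0, 3.0]  # x_1 increased by 1 unit, x_2 held fixed

ratio = odds(betas, xs_plus_one) / odds(betas, xs)
print(ratio, math.exp(betas[1]))  # the two values agree up to rounding error
```

The ratio does not depend on the starting values of the predictors, which is what makes $e^{\beta_i}$ interpretable as a single odds ratio for $x_i$.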


EXAMPLE 13.19 Data was obtained from 189 women who gave birth during a particular period at the 
Bayside Medical Center in Springfield, MA, in order to identify factors associated 
with low birth weight. The accompanying Minitab output resulted from a logistic 
regression in which the dependent variable indicated whether (1) or not (0) a child 
had low birth weight (<2500 g), and predictors were weight of the mother at her last 
menstrual period, age of the mother, and an indicator variable for whether (1) or not 
(0) the mother had smoked during pregnancy. 


Logistic Regression Table


                                                  Odds      95% CI
Predictor       Coef   SE Coef       Z       P   Ratio   Lower   Upper
Constant     2.06239   1.09516    1.88   0.060
Wt          -0.01701   0.00686   -2.48   0.013    0.98    0.97    1.00
Age         -0.04478   0.03391   -1.32   0.187    0.96    0.89    1.02
Smoke        0.65480   0.33297    1.97   0.049    1.92    1.00    3.70


It appears that age is not an important predictor of LBW, provided that the two other
predictors are retained. The other two predictors do appear to be informative. The point
estimate of the odds ratio associated with smoking status is 1.92 [ratio of the odds of
LBW for a smoker to the odds for a nonsmoker, where odds = P(Y = 1)/P(Y = 0)];
at the 95% confidence level, the odds of a low-birth-weight child could be as much
as 3.7 times as high for a smoker as for a nonsmoker.
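
The odds ratios and confidence limits in the output can be recovered from the coefficient and standard error columns. The Python sketch below assumes the usual large-sample interval, exp(coef ± 1.96·SE); that this is the formula behind the Minitab output is an assumption, although it does reproduce the tabled values to two decimal places. The names used are illustrative.

```python
import math

# (coefficient, standard error) pairs copied from the logistic regression table.
estimates = {
    "Wt": (-0.01701, 0.00686),
    "Age": (-0.04478, 0.03391),
    "Smoke": (0.65480, 0.33297),
}
Z_975 = 1.96  # standard normal critical value for 95% confidence


def odds_ratio_summary(coef, se, z=Z_975):
    """Return (odds ratio, lower limit, upper limit) for one predictor."""
    return (math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se))


summary = {name: odds_ratio_summary(c, s) for name, (c, s) in estimates.items()}
for name, (ratio, lower, upper) in summary.items():
    print(name, round(ratio, 2), round(lower, 2), round(upper, 2))
```

An interval for an odds ratio that includes 1, as the one for age does, corresponds to a coefficient that is not significantly different from 0 at the matching level.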


Please see one of the chapter references for more information on logistic 
regression, including methods for assessing model effectiveness and adequacy. 


Standardizing Variables 


In Section 13.3, we considered transforming $x$ to $x' = x - \bar{x}$ before fitting a
polynomial. For multiple regression, especially when values of variables are large in
magnitude, it is advantageous to carry this coding one step further. Let $\bar{x}_i$ and $s_i$ be the
sample average and sample standard deviation of the $x_{ij}$'s ($j = 1, \ldots, n$). Now code
each variable $x_i$ by $x_i' = (x_i - \bar{x}_i)/s_i$. The coded variable $x_i'$ simply reexpresses any
$x_i$ value in units of standard deviation above or below the mean. Thus if $\bar{x}_i = 100$ and
$s_i = 20$, $x_i = 130$ becomes $x_i' = 1.5$, because 130 is 1.5 standard deviations above
the mean of the values of $x_i$. For example, the coded full second-order model with
two independent variables has regression function


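$$
\begin{aligned}
E(Y) &= \beta_0 + \beta_1\left(\frac{x_1 - \bar{x}_1}{s_1}\right) + \beta_2\left(\frac{x_2 - \bar{x}_2}{s_2}\right) + \beta_3\left(\frac{x_1 - \bar{x}_1}{s_1}\right)^2 \\
&\quad + \beta_4\left(\frac{x_2 - \bar{x}_2}{s_2}\right)^2 + \beta_5\left(\frac{x_1 - \bar{x}_1}{s_1}\right)\left(\frac{x_2 - \bar{x}_2}{s_2}\right) \\
&= \beta_0 + \beta_1 x_1' + \beta_2 x_2' + \beta_3 x_3' + \beta_4 x_4' + \beta_5 x_5'
\end{aligned}
$$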


The benefits of coding are (1) increased numerical accuracy in all computations and 
(2) more accurate estimation than for the parameters of the uncoded model, because 
the individual parameters of the coded model characterize the behavior of the regres- 
sion function near the center of the data rather than near the origin. 


EXAMPLE 13.20 The article “The Value and the Limitations of High-Speed Turbo-Exhausters
for the Removal of Tar-Fog from Carburetted Water-Gas” (J. of the
Chemical Industry Society, 1946: 166-168) presents the data (in Table 13.5) on
y = tar content (grains/100 ft³) of a gas stream as a function of $x_1$ = rotor speed (rpm)
and $x_2$ = gas inlet temperature (°F). The data is also considered in the article “Some
Aspects of Nonorthogonal Data Analysis” (J. of Quality Tech., 1973: 67-79),
which suggests using the coded model described previously.


The means and standard deviations are $\bar{x}_1 = 2991.13$, $s_1 = 387.81$, $\bar{x}_2 = 58.468$,
and $s_2 = 6.944$, so $x_1' = (x_1 - 2991.13)/387.81$ and $x_2' = (x_2 - 58.468)/6.944$.
With $x_3' = (x_1')^2$, $x_4' = (x_2')^2$, and $x_5' = x_1' \cdot x_2'$, fitting the full second-order model yielded
$\hat\beta_0 = 40.2660$, $\hat\beta_1 = -13.4041$, $\hat\beta_2 = 10.2553$, $\hat\beta_3 = 2.3313$, $\hat\beta_4 = -2.3405$, and
$\hat\beta_5 = 2.5978$. The estimated regression equation is then

$$\hat{y} = 40.27 - 13.40x_1' + 10.26x_2' + 2.33x_3' - 2.34x_4' + 2.60x_5'$$

Thus if $x_1 = 3200$ and $x_2 = 57.0$, then $x_1' = .539$, $x_2' = -.211$, $x_3' = (.539)^2 = .2901$,
$x_4' = (-.211)^2 = .0447$, and $x_5' = (.539)(-.211) = -.1139$, so

$$\hat{y} = 40.27 - (13.40)(.539) + (10.26)(-.211) + (2.33)(.2901) - (2.34)(.0447) + (2.60)(-.1139) = 31.16$$
Table 13.5 Data for Example 13.20


Run      y      x1     x2        x1'        x2'
  1    60.0   2400   54.5   -1.52428    -.57145
  2    61.0   2450   56.0   -1.39535    -.35543
  3    65.0   2450   58.5   -1.39535     .00461
  4    30.5   2500   43.0   -1.26642   -2.22763
  5    63.5   2500   58.0   -1.26642    -.06740
  6    65.0   2500   59.0   -1.26642     .07662
  7    44.0   2700   52.5    -.75070    -.85948
  8    52.0   2700   65.5    -.75070    1.01272
  9    54.5   2700   68.0    -.75070    1.37276
 10    30.0   2750   45.0    -.62177   -1.93960
 11    26.0   2775   45.5    -.55731   -1.86759
 12    23.0   2800   48.0    -.49284   -1.50755
 13    54.0   2800   63.0    -.49284     .65268
 14    36.0   2900   58.5    -.23499     .00461
 15    33.5   2900   64.5    -.23499     .86870
 16    57.0   3000   66.0     .02287    1.08472
 17    33.5   3075   57.0     .21627    -.21141
 18    34.0   3100   57.5     .28073    -.13941
 19    44.0   3150   64.0     .40966     .79669
 20    33.0   3200   57.0     .53859    -.21141
 21    39.0   3200   64.0     .53859     .79669
 22    53.0   3200   69.0     .53859    1.51677
 23    38.5   3225   68.0     .60305    1.37276
 24    39.5   3250   62.0     .66752     .50866
 25    36.0   3250   64.5     .66752     .86870
 26     8.5   3250   48.0     .66752   -1.50755
 27    30.0   3500   60.0    1.31216     .22063
 28    29.0   3500   59.0    1.31216     .07662
 29    26.5   3500   58.0    1.31216    -.06740
 30    24.5   3600   58.0    1.57002    -.06740
 31    26.5   3900   61.0    2.34360     .36465
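
The coding step and the prediction worked out in Example 13.20 can be checked with a few lines of arithmetic. This is a minimal sketch that uses the means, standard deviations, and unrounded coefficient estimates quoted in the example; the function names are hypothetical.

```python
def standardize(x, mean, sd):
    """Code x as the number of standard deviations above or below the mean."""
    return (x - mean) / sd

# Sample means and standard deviations reported in Example 13.20.
X1_MEAN, X1_SD = 2991.13, 387.81   # rotor speed (rpm)
X2_MEAN, X2_SD = 58.468, 6.944     # gas inlet temperature (deg F)

# Estimated coefficients of the coded full second-order model (Example 13.20).
BETA = (40.2660, -13.4041, 10.2553, 2.3313, -2.3405, 2.5978)

def predict_tar(x1, x2):
    """Predicted tar content from the coded second-order model."""
    z1 = standardize(x1, X1_MEAN, X1_SD)
    z2 = standardize(x2, X2_MEAN, X2_SD)
    # Terms in the order 1, x1', x2', x3' = x1'^2, x4' = x2'^2, x5' = x1' * x2'.
    terms = (1.0, z1, z2, z1 ** 2, z2 ** 2, z1 * z2)
    return sum(b * t for b, t in zip(BETA, terms))

print(round(standardize(3200, X1_MEAN, X1_SD), 3))   # coded rotor speed
print(round(standardize(57.0, X2_MEAN, X2_SD), 3))   # coded temperature
print(round(predict_tar(3200, 57.0), 2))             # predicted tar content
```

Because the sketch keeps all decimal places, its prediction can differ from the hand calculation (31.16, based on rounded coefficients) in the second decimal place.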


Variable Selection 


Suppose an experimenter has obtained data on a response variable y as well as on p
candidate predictors $x_1, \ldots, x_p$. How can a best (in some sense) model involving a
subset of these predictors be selected? Recall that as predictors are added one by one
into a model, SSE cannot increase (a larger model cannot explain less variation than a
smaller one) and will usually decrease, albeit perhaps by a small amount. So there is
no mystery as to which model gives the largest $R^2$ value: it must be the one containing
all p predictors. What we'd really like is a model involving relatively few predictors
that is easy to interpret and use yet explains a relatively large amount of observed y
variation.

For any fixed number of predictors (e.g., 5), it is reasonable to identify the
best model of that size as the one with the largest $R^2$ value (equivalently, the smallest
value of SSE). The more difficult issue concerns selection of a criterion that will
allow for comparison of models of different sizes. Let's use a subscript k to denote a
quantity computed from a model containing k predictors (e.g., $SSE_k$). Three different
criteria, each one a simple function of $SSE_k$, are popular.
1. $R_k^2$, the coefficient of multiple determination for a k-predictor model. Because
$R_k^2$ will virtually always increase as k does (and can never decrease), we are not
interested in the k that maximizes $R_k^2$. Instead, we wish to identify a small k for
which $R_k^2$ is nearly as large as $R^2$ for all predictors in the model.


2. $MSE_k = SSE_k/(n - k - 1)$, the mean squared error for a k-predictor model. This
is often used in place of $R_k^2$, because although $R_k^2$ never decreases with increasing
k, a small decrease in $SSE_k$ obtained with one extra predictor can be more than
offset by a decrease of 1 in the denominator of $MSE_k$. The objective is then to
find the model having minimum $MSE_k$. Since adjusted $R_k^2 = 1 - MSE_k/MST$,
where $MST = SST/(n - 1)$ is constant in k, examination of adjusted $R_k^2$ is
equivalent to consideration of $MSE_k$.


3. The rationale for the third criterion, $C_k$, is more difficult to understand, but the
criterion is widely used by data analysts. Suppose the true regression model is
specified by m predictors, that is,

$$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m + \epsilon \qquad V(\epsilon) = \sigma^2$$

so that

$$E(Y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m$$

Consider fitting a model by using a subset of k of these m predictors; for simplicity,
suppose we use $x_1, x_2, \ldots, x_k$. Then by solving the system of normal
equations, estimates $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ are obtained (but not, of course, estimates of
any $\beta$'s corresponding to predictors not in the fitted model). The true expected
value E(Y) can then be estimated by $\hat{Y} = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k$. Now consider
the normalized expected total error of estimation

$$\Gamma_k = \frac{E\left(\sum_{i=1}^{n} [\hat{Y}_i - E(Y_i)]^2\right)}{\sigma^2} = \frac{E(SSE_k)}{\sigma^2} + 2(k + 1) - n \qquad (13.21)$$

The second equality in (13.21) must be taken on faith because it requires a tricky
expected-value argument. A particular subset is then appealing if its $\Gamma_k$ value is
small. Unfortunately, though, $E(SSE_k)$ and $\sigma^2$ are not known. To remedy this, let
$s^2$ denote the estimate of $\sigma^2$ based on the model that includes all predictors for
which data is available, and define

$$C_k = \frac{SSE_k}{s^2} + 2(k + 1) - n$$

A desirable model is then specified by a subset of predictors for which $C_k$ is small.


The total number of models that can be created from predictors in the
candidate pool is $2^p$ (because each predictor can be included in or left out of any
particular model; one of these is the model that contains no predictors). If p = 5,
then it would not be too tedious to examine all possible regression models involving
these predictors using any good statistical software package. But the computational
effort required to fit all possible models becomes prohibitive as the size of
the candidate pool increases. Several software packages have incorporated algorithms
which will sift through models of various sizes in order to identify the best
one or more models of each particular size. Minitab, for example, will do this for
$p \le 31$ and allows the user to specify the number of models of each size (1, 2, 3, 4,
or 5) that will be identified as having best criterion values. You might wonder why
we'd want to go beyond the best single model of each size. The answer is that the
2nd or 3rd best model may be just about as good as the best model and easier to
interpret and use, or may be more satisfactory from a model-adequacy perspective.
For example, suppose the candidate pool includes all predictors from a full quadratic
model based on five independent variables. Then the best 3-predictor model
might have predictors $x_2$, $x_4^2$, and $x_3 x_5$, whereas the second-best such model could
be the one with predictors $x_2$, $x_3$, and $x_2 x_3$.
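
To see how quickly the number of candidate models grows, the subsets of a small pool can simply be enumerated. The sketch below is only an illustration of the counting argument; the predictor labels are placeholders, and real best-subsets software uses far more efficient search strategies than listing every subset.

```python
from itertools import combinations

def all_subsets(predictors):
    """Yield every subset of the candidate pool, including the empty model."""
    for size in range(len(predictors) + 1):
        for subset in combinations(predictors, size):
            yield subset

pool = ["x1", "x2", "x3", "x4", "x5"]   # placeholder labels for p = 5 predictors
models = list(all_subsets(pool))

print(len(models))                                # 2**5 = 32 candidate models
print(sum(1 for m in models if len(m) == 3))      # number of 3-predictor models
print(2 ** 31)                                    # search-space size when p = 31
```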


EXAMPLE 13.21 The review article by Ron Hocking listed in the chapter bibliography reports on an
analysis of data taken from the 1974 issues of Motor Trend magazine. The dependent
variable y was gas mileage, there were n = 32 observations, and the predictors
for which data was obtained were $x_1$ = engine shape (1 = straight and 0 = V),
$x_2$ = number of cylinders, $x_3$ = transmission type (1 = manual and 0 = auto), $x_4$ =
number of transmission speeds, $x_5$ = engine size, $x_6$ = horsepower, $x_7$ = number of
carburetor barrels, $x_8$ = final drive ratio, $x_9$ = weight, and $x_{10}$ = quarter-mile time.
In Table 13.6, we present summary information from the analysis. The table
describes for each k the subset having minimum $SSE_k$; reading down the variables
column indicates which variable is added in going from k to k + 1 (going from k = 2
to k = 3, both $x_3$ and $x_{10}$ are added, and $x_2$ is deleted). Figure 13.21 contains plots of
$R_k^2$, adjusted $R_k^2$, and $C_k$ against k; these plots are an important visual aid in selecting
a subset. The estimate of $\sigma^2$ is $s^2 = 6.24$, which is $MSE_{10}$. A simple model that
rates highly according to all criteria is the one containing predictors $x_3$, $x_9$, and $x_{10}$.


Table 13.6 Best Subsets for Gas Mileage Data of Example 13.21


k = Number of
Predictors      Variables     SSE_k    R_k^2    Adjusted R_k^2     C_k
 1              9             247.2    .756     .748              11.6
 2              2             169.7    .833     .821               1.2
 3              3, 10, -2     150.4    .852     .836                .1
 4              6             142.3    .860     .839                .8
 5              5             136.2    .866     .840               1.8
 6              8             133.3    .869     .837               3.4
 7              4             132.0    .870     .832               5.2
 8              7             131.3    .871     .826               7.1
 9              1             131.1    .871     .818               9.0
10              2             131.0    .871     .809              11.0
[Plots of $R_k^2$, adjusted $R_k^2$, and $C_k$ against k]

Figure 13.21 $R_k^2$ and $C_k$ plots for the gas mileage data
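
The $MSE_k$ and $C_k$ criteria are simple functions of $SSE_k$, so the corresponding columns of Table 13.6 can be reproduced from the SSE column alone. The following is a minimal sketch using the SSE values of Example 13.21; the function and variable names are hypothetical, and small discrepancies from the printed table are possible because the SSE values are rounded to one decimal place.

```python
n = 32  # number of observations in Example 13.21

# SSE_k for the best subset of each size k, copied from Table 13.6.
sse = {1: 247.2, 2: 169.7, 3: 150.4, 4: 142.3, 5: 136.2,
       6: 133.3, 7: 132.0, 8: 131.3, 9: 131.1, 10: 131.0}

def mse(sse_k, k, n):
    """Mean squared error of a k-predictor model: SSE_k / (n - k - 1)."""
    return sse_k / (n - k - 1)

def mallows_c(sse_k, k, n, s2):
    """C_k = SSE_k / s^2 + 2(k + 1) - n, with s^2 taken from the largest model."""
    return sse_k / s2 + 2 * (k + 1) - n

s2 = mse(sse[10], 10, n)                       # MSE of the 10-predictor model
mse_k = {k: mse(v, k, n) for k, v in sse.items()}
c_k = {k: mallows_c(v, k, n, s2) for k, v in sse.items()}

best_by_mse = min(mse_k, key=mse_k.get)        # same k as maximum adjusted R^2
best_by_c = min(c_k, key=c_k.get)

for k in sorted(sse):
    print(k, round(mse_k[k], 3), round(c_k[k], 1))
print("s^2 =", round(s2, 2), "| min MSE at k =", best_by_mse, "| min C_k at k =", best_by_c)
```

Minimum $MSE_k$ occurs at k = 5, matching the largest adjusted $R_k^2$ in the table, while $C_k$ is smallest at k = 3, the subset singled out in the example.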
Generally speaking, when a subset of k predictors (k < m) is used to fit
a model, the estimators $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$ will be biased for $\beta_0, \beta_1, \ldots, \beta_k$ and $\hat{Y}$ will
also be a biased estimator for the true E(Y) (all this because m - k predictors are
missing from the fitted model). However, as measured by the total normalized
expected error $\Gamma_k$, estimates based on a subset can provide more precision than
would be obtained using all possible predictors; essentially, this greater precision is
obtained at the price of introducing a bias in the estimators. A value of k for which
$C_k \approx k + 1$ indicates that the bias associated with this k-predictor model would
be small.


EXAMPLE 13.22 The bond shear strength data introduced in Example 13.12 contains values of four
different independent variables $x_1$ through $x_4$. We found that the model with only these four
variables as predictors was useful, and there is no compelling reason to consider the
inclusion of second-order predictors. Figure 13.22 is the Minitab output that results
from a request to identify the two best models of each given size.

The best two-predictor model, with predictors power and temperature, seems
to be a very good choice on all counts: $R^2$ is significantly higher than for models with
fewer predictors yet almost as large as for any larger models, adjusted $R^2$ is almost
at its maximum for this data, and $C_k$ is small and close to 2 + 1 = 3.


Response is strength

                Adj.
Vars   R-sq     R-sq            C-p         s
   1   57.7     56.2           11.0    5.9289
   1   10.8     [illegible]    51.9    8.6045
   2   68.5     66.2            3.5    5.2070
   2   59.4     56.4           11.5    5.9136
   3   70.2     66.8            4.0    5.1590
   3   69.7     66.2            4.5    5.2078
   4   71.4     66.8            5.0    5.1580

[The columns of X marks showing which of force, power, temp, and time belong to each model are not legible.]

Figure 13.22 Output from Minitab's Best Subsets option


Stepwise Regression When the number of predictors is too large to allow for
explicit or implicit examination of all possible subsets, several alternative selection
procedures will generally identify good models. The simplest such procedure is the
backward elimination (BE) method. This method starts with the model in which
all predictors under consideration are used. Let the set of all such predictors be
$x_1, \ldots, x_m$. Then each t ratio $\hat\beta_i / s_{\hat\beta_i}$ ($i = 1, \ldots, m$) appropriate for testing $H_0$: $\beta_i = 0$
versus $H_a$: $\beta_i \ne 0$ is examined. If the t ratio with the smallest absolute value is less
than a prespecified constant $t_{out}$, that is, if

$$\min_{i = 1, \ldots, m} \left| \frac{\hat\beta_i}{s_{\hat\beta_i}} \right| < t_{out}$$

then the predictor corresponding to the smallest ratio is eliminated from the model.
The reduced model is now fit, the m - 1 t ratios are again examined, and another
predictor is eliminated if it corresponds to the smallest absolute t ratio smaller
than $t_{out}$. In this way, the algorithm continues until, at some stage, all absolute
t ratios are at least $t_{out}$. The model used is the one containing all predictors that were
not eliminated. The value $t_{out} = 2$ is often recommended since most $t_{.05}$ values are
near 2. Some computer packages focus on P-values rather than t ratios.
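
The backward elimination loop can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the algorithm of any particular package: the data are synthetic (y depends on x1 and x2 only, while x3 and x4 are noise), the helper names are hypothetical, and least squares is solved through the normal equations, which is adequate only for small, well-conditioned problems.

```python
import math
import random

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def t_ratios(columns, y):
    """Fit y on the given predictor columns (plus intercept); return each predictor's t ratio."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    inv = inverse(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    s2 = sse / (n - p)
    return [beta[a] / math.sqrt(s2 * inv[a][a]) for a in range(1, p)]

def backward_elimination(predictors, y, t_out=2.0):
    """Drop the predictor with the smallest |t| until every |t| is at least t_out."""
    kept = list(predictors)
    while kept:
        ratios = t_ratios([predictors[name] for name in kept], y)
        weakest = min(range(len(kept)), key=lambda i: abs(ratios[i]))
        if abs(ratios[weakest]) >= t_out:
            break
        kept.pop(weakest)
    return kept

# Synthetic illustration: only x1 and x2 truly affect y.
rng = random.Random(1)
n = 60
data = {name: [rng.gauss(0, 1) for _ in range(n)] for name in ("x1", "x2", "x3", "x4")}
y = [3 + 2 * a - 1.5 * b + rng.gauss(0, 0.5) for a, b in zip(data["x1"], data["x2"])]

kept = backward_elimination(data, y)
print(kept)
```

Replacing the stopping rule with a comparison of P-values against a threshold would give the P-value version mentioned above.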
EXAMPLE 13.23 (Example 13.20 continued) For the coded full quadratic model in which y = tar content,
the five potential predictors are $x_1'$, $x_2'$, $x_3' = x_1'^2$, $x_4' = x_2'^2$, and $x_5' = x_1' x_2'$ (so m = 5).
Without specifying $t_{out}$, the predictor with the smallest absolute t ratio (asterisked) was
eliminated at each stage, resulting in the sequence of models shown in Table 13.7.


Table 13.7 Backward Elimination Results for the Data of Example 13.20


                               |t-ratio|
Step   Predictors        1       2       3       4       5
  1    1, 2, 3, 4, 5   16.0    10.8     2.9     2.8     1.8*
  2    1, 2, 3, 4      15.4    10.2     3.7     2.0*     -
  3    1, 2, 3         14.5    12.2     4.3*     -       -
  4    1, 2            10.9     9.1*     -       -       -
  5    1                4.4*     -       -       -       -

Using $t_{out} = 2$, the resulting model would be based on $x_1'$, $x_2'$, and $x_3'$, since at Step 3
no predictor could be eliminated. It can be verified that each subset is actually the best
subset of its size, though this is by no means always the case.


An alternative to the BE procedure is forward selection (FS). FS starts
with no predictors in the model and considers fitting in turn the model with only
$x_1$, only $x_2$, ..., and finally only $x_m$. The variable that, when fit, yields the largest
absolute t ratio enters the model provided that the ratio exceeds the specified constant
$t_{in}$. Suppose $x_1$ enters the model. Then models with $(x_1, x_2)$, $(x_1, x_3)$, ..., $(x_1, x_m)$
are considered in turn. The largest $|\hat\beta_j / s_{\hat\beta_j}|$ ($j = 2, \ldots, m$) then specifies the entering
predictor provided that this maximum also exceeds $t_{in}$. This continues until at
some step no absolute t ratios exceed $t_{in}$. The entered predictors then specify the
model. The value $t_{in} = 2$ is often used for the same reason that $t_{out} = 2$ is used in
BE. For the tar-content data, FS resulted in the sequence of models given in Steps
5, 4, ..., 1 in Table 13.7 and thus is in agreement with BE. This will not always
be the case.

The stepwise procedure most widely used is a combination of FS and BE,
denoted by FB. This procedure starts as does forward selection, by adding variables
to the model, but after each addition it examines those variables previously entered
to see whether any is a candidate for elimination. For example, if there are eight
predictors under consideration and the current set consists of $x_2$, $x_3$, $x_5$, and $x_6$ with
$x_5$ having just been added, the t ratios $\hat\beta_2 / s_{\hat\beta_2}$, $\hat\beta_3 / s_{\hat\beta_3}$, and $\hat\beta_6 / s_{\hat\beta_6}$ are examined. If the
smallest absolute ratio is less than $t_{out}$, then the corresponding variable is eliminated
from the model (some software packages base decisions on $f = t^2$). The idea behind
FB is that, with forward selection, a single variable may be more strongly related to
y than either of two or more other variables individually, but the combination of
these variables may make the single variable subsequently redundant. This actually
happened with the gas-mileage data discussed in Example 13.21, with $x_2$ entering
and subsequently leaving the model.

Although in most situations these automatic selection procedures will 
identify a good model, there is no guarantee that the best or even a nearly best model 
will result. Close scrutiny should be given to data sets for which there appear to be 
strong relationships among some of the potential predictors; we will say more about 
this shortly. 
Identification of Influential Observations


In simple linear regression, it is easy to spot an observation whose x value is
much larger or much smaller than other x values in the sample. Such an observation
may have a great impact on the estimated regression equation (whether it
actually does depends on how far the point (x, y) falls from the line determined
by the other points in the scatterplot). In multiple regression, it is also desirable
to know whether the values of the predictors for a particular observation are such
that it has the potential for exerting great influence on the estimated equation.
One method for identifying potentially influential observations relies on the fact
that because each $\hat\beta_i$ is a linear function of $y_1, y_2, \ldots, y_n$, each predicted y value
of the form $\hat{y} = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k$ is also a linear function of the $y_j$'s. In
particular, the predicted values corresponding to sample observations can be
written as follows:

$$
\begin{aligned}
\hat{y}_1 &= h_{11} y_1 + h_{12} y_2 + \cdots + h_{1n} y_n \\
\hat{y}_2 &= h_{21} y_1 + h_{22} y_2 + \cdots + h_{2n} y_n \\
&\;\;\vdots \\
\hat{y}_n &= h_{n1} y_1 + h_{n2} y_2 + \cdots + h_{nn} y_n
\end{aligned}
$$

Each coefficient $h_{ij}$ is a function only of the $x_{ij}$'s in the sample and not of the $y_j$'s. It
can be shown that $h_{ij} = h_{ji}$ and that $0 \le h_{jj} \le 1$.

Let's focus on the "diagonal" coefficients $h_{11}, h_{22}, \ldots, h_{nn}$. The coefficient $h_{jj}$ is
the weight given to $y_j$ in computing the corresponding predicted value $\hat{y}_j$. This quantity
can also be expressed as a measure of the distance between the point $(x_{1j}, \ldots, x_{kj})$
in k-dimensional space and the center of the data $(\bar{x}_{1\cdot}, \ldots, \bar{x}_{k\cdot})$. It is therefore natural
to characterize an observation whose $h_{jj}$ is relatively large as one that has potentially
large influence. Unless there is a perfect linear relationship among the k predictors,
$\sum_{j=1}^{n} h_{jj} = k + 1$, so the average of the $h_{jj}$'s is $(k + 1)/n$. Some statisticians suggest
that if $h_{jj} > 2(k + 1)/n$, the jth observation be cited as being potentially influential;
others use $3(k + 1)/n$ as the dividing line.


EXAMPLE 13.24 The accompanying data appeared in the article “Testing for the Inclusion of
Variables in Linear Regression by a Randomization Technique” (Technometrics,
1966: 695-699) and was reanalyzed in Hoaglin and Welsch, “The Hat Matrix
in Regression and ANOVA” (Amer. Statistician, 1978: 17-23). The $h_{ij}$'s (with
elements below the diagonal omitted by symmetry) follow the data.


Beam Number   Specific Gravity (x1)   Moisture Content (x2)   Strength (y)
     1                .499                    11.1                11.14
     2                .558                     8.9                12.74
     3                .604                     8.8                13.13
     4                .441                     8.9                11.51
     5                .550                     8.8                12.38
     6                .528                     9.9                12.60
     7                .418                    10.7                11.13
     8                .480                    10.5                11.70
     9                .406                    10.5                11.02
    10                .467                    10.7                11.41

        1      2      3      4      5      6      7      8      9     10
 1   .418  -.002   .079  -.274  -.046   .181   .128   .222   .050   .242
 2          .242   .292   .136   .243   .128  -.041   .033  -.035   .004
 3                 .417  -.019   .273   .187  -.126   .044  -.153   .004
 4                        .604   .197  -.038   .168  -.022   .275  -.028
 5                               .252   .111  -.030   .019  -.010  -.010
 6                                      .148   .042   .117   .012   .111
 7                                             .262   .145   .277   .174
 8                                                    .154   .120   .168
 9                                                           .316   .148
10                                                                  .187


Here k = 2, so (k + 1)/n = 3/10 = .3; since $h_{44} = .604 > 2(.3)$, the fourth data
point is identified as potentially influential.
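
The diagonal coefficients $h_{jj}$ are the diagonal entries of $X(X'X)^{-1}X'$, where X is the design matrix with a leading column of 1's. The sketch below computes them for the beam data of Example 13.24 in plain Python; the helper names are hypothetical, and inverting $X'X$ directly is acceptable here only because the problem is tiny.

```python
def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

# Beam data from Example 13.24.
gravity = [.499, .558, .604, .441, .550, .528, .418, .480, .406, .467]
moisture = [11.1, 8.9, 8.8, 8.9, 8.8, 9.9, 10.7, 10.5, 10.5, 10.7]

rows = [[1.0, g, m] for g, m in zip(gravity, moisture)]   # design matrix X
n, p = len(rows), len(rows[0])                            # p = k + 1 with k = 2
xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
inv = inverse(xtx)

def hat(i, j):
    """Entry h_ij of X (X'X)^{-1} X' (0-based indices)."""
    return sum(rows[i][a] * inv[a][b] * rows[j][b] for a in range(p) for b in range(p))

leverage = [hat(j, j) for j in range(n)]
cutoff = 2 * p / n                                        # 2(k + 1)/n
flagged = [j + 1 for j, h in enumerate(leverage) if h > cutoff]

print([round(h, 3) for h in leverage])
print("sum of h_jj:", round(sum(leverage), 6), "| flagged observations:", flagged)
```

The diagonal entries sum to k + 1 = 3, and only the fourth observation exceeds the 2(k + 1)/n = .6 cutoff.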


Another technique for assessing the influence of the jth observation that takes
into account $y_j$ as well as the predictor values involves deleting the jth observation
from the data set and performing a regression based on the remaining observations.
If the estimated coefficients from the "deleted observation" regression differ greatly
from the estimates based on the full data, the jth observation has clearly had a substantial
impact on the fit. One way to judge whether estimated coefficients change
greatly is to express each change relative to the estimated standard deviation of the
coefficient:

$$\frac{(\hat\beta_i \text{ before deletion}) - (\hat\beta_i \text{ after deletion})}{s_{\hat\beta_i}} = \frac{\text{change in } \hat\beta_i}{s_{\hat\beta_i}}$$

There exist efficient computational formulas that allow all this information to be
obtained from the "no-deletion" regression, so that the additional n regressions are
unnecessary.


EXAMPLE 13.25 (Example 13.24 continued) Consider separately deleting observations 1 and 6, whose
residuals are the largest, and observation 4, where $h_{jj}$ is large. Table 13.8 contains the
relevant information.


Table 13.8 Changes in Estimated Coefficients for Example 13.25


                                                    Change When Point j Is Deleted
Parameter   No-Deletions Estimates   Estimated SD     j = 1     j = 4     j = 6
β0                 10.302               1.896         2.710    -2.109     -.642
β1                  8.495               1.784        -1.772     1.695      .748
β2                 -.2663               .1273        -.1932     .1242     .0329
e_j*:                                                 -3.25      -.96      2.20
h_jj:                                                  .418      .604      .148


For deletion of both point 1 and point 4, the change in each estimate is in the range
1-1.5 standard deviations, which is reasonably substantial (this does not tell us what
would happen if both points were simultaneously omitted). For point 6, however,
the change is roughly .25 standard deviation. Thus points 1 and 4, but not 6, might
well be omitted in calculating a regression equation.
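
The changes in Table 13.8 can be checked by brute force, refitting the regression with one observation removed at a time. The sketch below does this for the beam data of Example 13.24; it deliberately uses the slow "refit n times" approach rather than the efficient formulas mentioned above, the helper names are hypothetical, and the standard deviations are recomputed from the data rather than copied from the table, so small differences from the printed values are possible.

```python
import math

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def fit(rows, y):
    """Least squares estimates and their estimated standard deviations."""
    n, p = len(rows), len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    inv = inverse(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    s2 = sse / (n - p)
    return beta, [math.sqrt(s2 * inv[a][a]) for a in range(p)]

# Beam data from Example 13.24.
gravity = [.499, .558, .604, .441, .550, .528, .418, .480, .406, .467]
moisture = [11.1, 8.9, 8.8, 8.9, 8.8, 9.9, 10.7, 10.5, 10.5, 10.7]
strength = [11.14, 12.74, 13.13, 11.51, 12.38, 12.60, 11.13, 11.70, 11.02, 11.41]
rows = [[1.0, g, m] for g, m in zip(gravity, moisture)]

beta, se = fit(rows, strength)                 # "no-deletion" regression

def change_when_deleted(j):
    """(estimate before deletion) - (estimate after deletion) for observation j (1-based)."""
    keep = [i for i in range(len(rows)) if i != j - 1]
    beta_del, _ = fit([rows[i] for i in keep], [strength[i] for i in keep])
    return [b - bd for b, bd in zip(beta, beta_del)]

print("estimates:", [round(b, 4) for b in beta])
for j in (1, 4, 6):
    change = change_when_deleted(j)
    print(j, [round(c, 4) for c in change], [round(c / s, 2) for c, s in zip(change, se)])
```

Each printed row shows the raw changes followed by the same changes expressed in units of the estimated standard deviation, which is the scale used to judge them in the example.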
Multicollinearity


In many multiple regression data sets, the predictors $x_1, x_2, \ldots, x_k$ are highly
interdependent. Consider the usual model

$$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \epsilon$$

with data $(x_{1j}, \ldots, x_{kj}, y_j)$ ($j = 1, \ldots, n$) available for fitting. Suppose the principle
of least squares is used to regress $x_i$ on the other predictors $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k$,
resulting in

$$\hat{x}_i = a_0 + a_1 x_1 + \cdots + a_{i-1} x_{i-1} + a_{i+1} x_{i+1} + \cdots + a_k x_k$$

It can then be shown that

$$V(\hat\beta_i) = \frac{\sigma^2}{\sum_{j=1}^{n} (x_{ij} - \hat{x}_{ij})^2} \qquad (13.22)$$

When the sample $x_i$ values can be predicted very well from the other predictor values,
the denominator of (13.22) will be small, so $V(\hat\beta_i)$ will be quite large. If this is the case
for at least one predictor, the data is said to exhibit multicollinearity. Multicollinearity
is often suggested by a regression computer output in which $R^2$ is large but some of
the t ratios $\hat\beta_i / s_{\hat\beta_i}$ are small for predictors that, based on prior information and intuition,
seem important. Another clue to the presence of multicollinearity lies in a $\hat\beta_i$ value that
has the opposite sign from that which intuition would suggest, indicating that another
predictor or collection of predictors is serving as a "proxy" for $x_i$.

An assessment of the extent of multicollinearity can be obtained by regressing
each predictor in turn on the remaining k - 1 predictors. Let $R_i^2$ denote the value of $R^2$
in the regression with dependent variable $x_i$ and predictors $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k$. It
has been suggested that severe multicollinearity is present if $R_i^2 > .9$ for any i. Some
statistical software packages will refuse to include a predictor in the model when its $R_i^2$
value is quite close to 1.
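
The check described above, regressing each predictor on the others and inspecting $R_i^2$, can be sketched as follows. The data are hypothetical and constructed so that x3 is almost exactly x1 + x2; the helper names are likewise assumptions. The quantity $1/(1 - R_i^2)$ printed alongside is the factor by which the denominator of (13.22) shrinks relative to the total variation in $x_i$.

```python
def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def r_squared(target, others):
    """R^2 from regressing `target` on the columns in `others` (with an intercept)."""
    n = len(target)
    rows = [[1.0] + [col[i] for col in others] for i in range(n)]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * t for r, t in zip(rows, target)) for a in range(p)]
    inv = inverse(xtx)
    coef = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    sse = sum((t - sum(c * v for c, v in zip(coef, r))) ** 2 for r, t in zip(rows, target))
    mean = sum(target) / n
    sst = sum((t - mean) ** 2 for t in target)
    return 1 - sse / sst

# Hypothetical predictors: x3 is nearly the sum of x1 and x2.
x1 = [float(v) for v in range(1, 13)]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
wiggle = [0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2]
x3 = [a + b + w for a, b, w in zip(x1, x2, wiggle)]
predictors = {"x1": x1, "x2": x2, "x3": x3}

r2 = {}
for name, col in predictors.items():
    others = [c for other, c in predictors.items() if other != name]
    r2[name] = r_squared(col, others)
    flag = "severe" if r2[name] > 0.9 else "ok"
    print(name, round(r2[name], 4), round(1 / (1 - r2[name]), 1), flag)
```

Because each of the three predictors is almost a linear combination of the other two, every $R_i^2$ exceeds the .9 guideline in this constructed example.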

There is no consensus among statisticians as to what remedies are appropriate 
when severe multicollinearity is present. One possibility involves continuing to use a 
model that includes all the predictors but estimating parameters by using something 
other than least squares. Consult a chapter reference for more details. 


EXERCISES Section 13.5 (55-64) 


55. The article “The Influence of Honing Process Parameters on Surface Quality,
Productivity, Cutting Angle, and Coefficient of Friction” (Industrial Lubrication and
Tribology, 2012: 77-83) included the following data on $x_1$ = cutting speed (m/s), $x_2$ =
specific pressure of pre-honing process (N/mm²), $x_3$ = specific pressure of finishing
honing process, and y = productivity in the honing process (mm³/s for a particular tool;
productivity is the volume of the material cut in a second).

 x1     x2     x3      y          x1     x2     x3      y
0.93   1.00   0.20   32.95       0.93   1.40   0.50   33.67
1.11   1.00   0.20   38.72       1.11   1.40   0.50   38.72
0.93   1.00   0.50   35.20       1.02   1.18   0.31   35.20
1.11   1.00   0.50   38.72       1.02   1.18   0.31   33.67
0.93   1.40   0.20   32.27       1.02   1.18   0.31   36.02
1.11   1.40   0.20   39.71       1.02   1.18   0.31   32.27
a. The article proposed a multivariate power model
$Y = \alpha x_1^{\beta_1} x_2^{\beta_2} x_3^{\beta_3} \epsilon$. The implied linear regression model
involves regressing ln(y) against the three predictors
ln($x_1$), ln($x_2$), and ln($x_3$). Partial Minitab output from
fitting this latter model is as follows (the corresponding
estimated power regression function appeared in the
cited article).

Predictor       Coef   SE Coef       T       P
Constant     3.58797   0.04909   73.10   0.000
lnx1          0.8439    0.1952    4.32   0.003
lnx2         -0.0280    0.1027   -0.27   0.792
lnx3         0.02449   0.03768    0.65   0.534

S = 0.048848   R-Sq = 70.6%   R-Sq(adj) = 59.5%

Carry out the model utility test at significance level .05.

b. The large P-value corresponding to the t ratio for ln($x_2$)
suggests that this predictor can be eliminated from the
model. Doing so and refitting yields the following
Minitab output.

Predictor       Coef   SE Coef       T       P
Constant     3.58329   0.04355   82.28   0.000
lnx1          0.8440    0.1849    4.57   0.001
lnx3         0.02449   0.03569    0.69   0.510

S = 0.0462680   R-Sq = 70.3%   R-Sq(adj) = 63.7%

Given that ln($x_1$) remains in the model, should ln($x_3$)
be retained?

c. Fit the simple linear regression model implied by
your conclusion in (b) to the transformed data, and
carry out a test of model utility.

d. The standardized residuals from the fit referred to in
(c) are .03, .33, 1.69, .33, -.49, .96, .57, .33, -.25,
-1.28, .29, -2.26. Plot these against ln($x_1$). What
does the pattern suggest?

e. Fitting a quadratic regression model to relate ln(y) to
ln($x_1$) gave the following Minitab output. Carry out a
test of model utility at significance level .05 (the pattern
in residual plots is satisfactory). Then use the
fact that $s_{\hat{Y}'} = .0178$ [$Y' = \ln(Y)$] when $x_1 = 1$ to
obtain a 95% prediction interval for productivity.

Predictor        Coef   SE Coef        T       P
Constant      3.51879   0.01775   198.22   0.000
lnx1           0.6231    0.1683     3.70   0.005
lnx1 sqd        7.240     2.834     2.55   0.031

S = 0.0361358   R-Sq = 81.9%   R-Sq(adj) = 77.9%

56.
In an experiment to study factors influencing wood specific
gravity (“Anatomical Factors Influencing Wood
Specific Gravity of Slash Pines and the Implications
for the Development of a High-Quality Pulpwood,”
TAPPI, 1964: 401-404), a sample of 20 mature wood
samples was obtained, and measurements were taken on
the number of fibers/mm² in springwood ($x_1$), number of
fibers/mm² in summerwood ($x_2$), % springwood ($x_3$),
light absorption in springwood ($x_4$), and light absorption
in summerwood ($x_5$).

a. Fitting the regression function $\mu_{Y \cdot x_1, x_2, x_3, x_4, x_5} = \beta_0 +
\beta_1 x_1 + \cdots + \beta_5 x_5$ resulted in $R^2 = .769$. Does the
data indicate that there is a linear relationship
between specific gravity and at least one of the predictors?
Test using $\alpha = .01$.

b. When $x_2$ is dropped from the model, the value of $R^2$
remains at .769. Compute adjusted $R^2$ for both the
full model and the model with $x_2$ deleted.

c. When $x_1$, $x_2$, and $x_4$ are all deleted, the resulting value
of $R^2$ is .654. The total sum of squares is SST =
.0196610. Does the data suggest that all of $x_1$, $x_2$, and
$x_4$ have zero coefficients in the true regression
model? Test the relevant hypotheses at level .05.

d. The mean and standard deviation of $x_3$ were
52.540 and 5.4447, respectively, whereas those of
$x_5$ were 89.195 and 3.6660, respectively. When the
model involving these two standardized variables
was fit, the estimated regression equation was $\hat{y} =
.5255 - .0236x_3' + .0097x_5'$. What value of specific
gravity would you predict for a wood sample with
% springwood = 50 and % light absorption in
summerwood = 90?

e. The estimated standard deviation of the estimated
coefficient $\hat\beta_3$ of $x_3'$ (i.e., for $\hat\beta_3$ of the standardized
model) was .0046. Obtain a 95% CI for $\beta_3$.

f. Using the information in parts (d) and (e), what is the
estimated coefficient of $x_3$ in the unstandardized
model (using only predictors $x_3$ and $x_5$), and what is
the estimated standard deviation of the coefficient
estimator (i.e., $s_{\hat\beta_3}$ for $\hat\beta_3$ in the unstandardized
model)?

g. The estimate of $\sigma$ for the two-predictor model is
s = .02001, whereas the estimated standard deviation of
$\hat\beta_0 + \hat\beta_3 x_3' + \hat\beta_5 x_5'$ when $x_3' = -.3747$ and $x_5' = -.2769$
(i.e., when $x_3 = 50.5$ and $x_5 = 88.9$) is .00482.
Compute a 95% PI for specific gravity when %
springwood = 50.5 and % light absorption in
summerwood = 88.9.

57.
In the accompanying table, we give the smallest SSE for
each number of predictors k (k = 1, 2, 3, 4) for a regression
problem in which y = cumulative heat of hardening
in cement, $x_1$ = % tricalcium aluminate, $x_2$ = % tricalcium
silicate, $x_3$ = % aluminum ferrate, and $x_4$ = %
dicalcium silicate.


Number of
Predictors k     Predictor(s)         SSE
     1           x4                 880.85
     2           x1, x2              58.01
     3           x1, x2, x3          49.20
     4           x1, x2, x3, x4      47.86


In addition, n = 13 and SST = 2715.76.
a. Use the criteria discussed in the text to recommend
the use of a particular regression model.

b. Would forward selection result in the best two-predictor
model? Explain.

58. The article “Response Surface Methodology for
Protein Extraction Optimization of Red Pepper Seed”
(Food Sci. and Tech., 2010: 226-231) gave data on the
response variable y = protein yield (%) and the independent
variables $x_1$ = temperature (°C), $x_2$ = pH, $x_3$ =
extraction time (min), and $x_4$ = solvent/meal ratio.

a. Fitting the model with the four $x_i$'s as predictors gave
the following output:

Predictor       Coef   SE Coef       T       P
Constant      -4.586     2.542   -1.80   0.084
x1           0.01317   0.02707    0.49   0.631
x2            1.6350    0.2707    6.04   0.000
x3           0.02883   0.01353    2.13   0.044
x4           0.05400   0.02707    1.99   0.058

Source            DF        SS       MS       F       P
Regression         4   19.8882   4.9721   11.31   0.000
Residual Error    24   10.5513   0.4396
Total             28   30.4395

Calculate and interpret the values of $R^2$ and adjusted $R^2$.
Does the model appear to be useful?

b. Fitting the complete second-order model gave the
following results:

Predictor          Coef      SE Coef         T       P
Constant    [illegible]        18.53     -6.45   0.000
x1              -0.1047       0.2839     -0.37   0.718
x2               28.678        3.625      7.91   0.000
x3               0.4074       0.1303      3.13   0.007
x4               0.2711       0.2606      1.04   0.316
x1sqd         -0.000752     0.002110     -0.36   0.727
x2sqd           -1.6452       0.2110     -7.80   0.000
x3sqd       [row illegible]
x4sqd         -0.015152     0.002110     -7.18   0.000
x1x2            0.02150      0.02687      0.80   0.437
x1x3        [row illegible]
x1x4          -0.000800     0.002687     -0.30   0.770
x2x3           -0.05900      0.01344     -4.39   0.001
x2x4            0.03900      0.02687      1.45   0.169
x3x4           0.002725     0.001344      2.03   0.062

S = 0.268703   R-Sq = 96.7%   R-Sq(adj) = 93.4%

Source            DF        SS       MS       F       P
Regression        14   29.4287   2.1020   29.11   0.000
Residual Error    14    1.0108   0.0722
Total             28   30.4395

Does at least one of the second-order predictors
appear to be useful? Carry out an appropriate test of
hypotheses.

c. From the output in (b), a reasonable conjecture is
that none of the predictors involving $x_1$ are providing
useful information. When these predictors are eliminated,
the value of SSE for the reduced regression
model is 1.1887. Does this support the conjecture?

d. Here is output from Minitab's best subsets option,
with just the single best subset of each size identified.
Which model(s) would you consider using
(subject to checking model adequacy)?

Minitab output for Exercise 58d

                            Mallows
Vars   R-Sq   R-Sq(adj)        Cp         S
   1   52.7        50.9     174.4   0.73030
   2   67.9        65.4     112.5   0.61349
   3   77.7        75.0      73.1   0.52124
   4   83.4        80.7      50.8   0.45835
   5   90.9        88.9      21.4   0.34731
   6   94.6        93.1       7.9   0.27422
   7   95.8        94.4       4.7   0.24683
   8   96.2        94.6       5.1   0.24137
   9   96.4        94.7       6.1   0.23962
  10   96.6        94.6       7.5   0.24132
  11   96.6        94.4       9.4   0.24716
  12   96.6        94.1      11.2   0.25328
  13   96.7        93.8      13.1   0.26041
  14   96.7        93.4      15.0   0.26870

[The columns of X marks showing which of x1, x2, x3, x4, x1sqd, x2sqd, x3sqd, x4sqd, x1x2, x1x3, x1x4, x2x3, x2x4, and x3x4 belong to each subset are not legible.]
59. Reconsider the wood specific gravity data referred to in
Exercise 56.

a. Minitab’s Best Regression option was used, resulting
in the accompanying output. Which model(s)
would you recommend for investigating in more
detail?
Response is spgrav

Predictors: sprngfib, sumrfib, %sprwood, spltabs, sumltabs

Vars R-Sq R-Sq(adj) C-p s
1 56.4 53.9 10.6 0.021832 X
1 10.6 5.7 38.5 0.031245 X
1 5.3 0.1 41.7 0.032155 X
2 65.5 61.4 7.0 0.019975 X X
2 62.0 57.6 9.1 0.020950 X X
2 60.3 55.6 10.2 0.021439 X X
3 72.3 67.1 4.9 0.018461 X X X
3 71.2 65.8 5.6 0.018807 X X X
3 71.1 65.7 5.6 0.018846 X X X
4 77.0 70.9 4.0 0.017353 X X X X
4 74.8 68.1 5.4 0.018179 X X X X
4 72.7 65.4 6.7 0.018919 X X X X
5 77.0 68.9 6.0 0.017953 X X X X X



b. The accompanying Minitab output resulted from apply- 
ing both the backward elimination method and the 
forward selection method. For each method, explain 
what occurred at every iteration of the algorithm. 


Response is spgrav on 5 predictors, 
with N= 20 


Step 2 3 4 
Constant 0.442 0.4384 0.4381 0.5179 
sprngfib 0.0001 0.00011 0.00012 

T-Value A ee 2.595) 1.98 

sumrfib 0.0000 

T-Value 0.12 

sSsprwood —0.0053 —0.00526 —0.00498 —0.00438 
T-Value —5.70 =16:.:56) —5.96 =—5..20 
spltabs —0.0018 —0.0019 

T-Value =21.63 =e 16 

sumltabs 0.0044 0.0044 0.0031 0.0027 
T-Value 3.01 Bids 2.63 2.12 
Ss 0.0180 0.0174 0.0185 0.0200 
R-Sq Ts OS: LT 03 72.27 65.50 
Step 1 2
Constant 0.7585 0.5179
%sprwood -0.00444 -0.00438
T-Value -4.82 -5.20
sumltabs 0.0027
T-Value 2.12
S 0.0218 0.0200
R-Sq 56.36 65.50


60. Pillar stability is a most important factor to ensure safe
conditions in underground mines. The authors of
“Developing Coal Pillar Stability Chart Using
Logistic Regression” (Intl. J. of Rock Mechanics &
Mining Sci., 2013: 55-60) used a logistic regression
model to predict stability. The article reported the following
data on x₁ = pillar height to width ratio, x₂ =
pillar strength to stress ratio, and stability status for 29
coal pillars.


ID x₁ x₂ Stable? ID x₁ x₂ Stable?
1 1.80 2.40 Y 16 0.80 1.37 N
2 1.65 2.54 Y 17 0.60 1.27 N
3 2.70 0.84 Y 18 1.30 0.87 N
4 3.67 1.68 Y 19 0.83 0.97 N
5 1.41 2.41 Y 20 0.57 0.94 N
6 1.76 1.93 Y 21 1.44 1.00 N
7 2.10 1.77 Y 22 2.08 0.78 N
8 2.10 1.50 Y 23 1.50 1.03 N
9 4.57 2.43 Y 24 1.38 0.82 N
10 3.59 5.55 Y 25 0.94 1.30 N
11 8.33 2.58 Y 26 1.58 0.83 N
12 2.86 2.00 Y 27 1.67 1.05 N
13 2.58 3.68 Y 28 3.00 1.19 N
14 2.90 1.13 Y 29 2.21 0.86 N
15 3.89 2.49 Y


The corresponding logistic regression output from R is 
given here: 


Coefficients:

            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -13.146      5.184  -2.536   0.0112
x1             2.774      1.477   1.878   0.0604
x2             5.668      2.642   2.145   0.0319
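
In a logistic regression, exponentiating a coefficient gives the factor by which the estimated odds of the event change when that predictor increases by one unit and the other predictor is held fixed. The following Python sketch carries out that arithmetic for the two estimates in the R output. It assumes the fitted model treats a stable pillar as the success event, which the output itself does not state, and the variable names are illustrative only.

```python
import math

# Coefficient estimates copied from the R output for the pillar data.
coefficients = {"x1": 2.774, "x2": 5.668}

# exp(coefficient) = multiplicative change in the estimated odds of the
# event (assumed here to be "stable") per one-unit increase in the
# predictor, with the other predictor held fixed.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: estimated odds multiplied by about {ratio:.1f} per unit increase")
```

Because a one-unit change in either ratio is large relative to the observed spread of the data, it may be more natural to report the effect of a smaller increment, for example `math.exp(0.1 * 5.668)` for a 0.1 increase in x₂.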


a. Use the output with α = .1 to determine whether
the two predictor variables appear to have a significant
impact on pillar stability.

b. Provide interpretations for e^2.774 and e^5.668.


61. Reconsider the wood specific gravity data referred to in
Exercise 56. The following R² values resulted from
regressing each predictor on the other four predictors (in
the first regression, the dependent variable was x₁ and the
predictors were x₂–x₅, etc.): .628, .711, .341, .403, and
.403. Does multicollinearity appear to be a substantial
problem? Explain.


62. A study carried out to investigate the relationship between
a response variable relating to pressure drops in a screen-plate
bubble column and the predictors x₁ = superficial
fluid velocity, x₂ = liquid viscosity, and x₃ = opening
mesh size resulted in the accompanying data
(“A Correlation of Two-Phase Pressure Drops in
Screen-Plate Bubble Column,” Canad. J. of Chem.
Engr., 1993: 460-463). The standardized residuals and
hᵢᵢ values resulted from the model with just x₁, x₂, and x₃
as predictors. Are there any unusual observations?
Data for Exercise 62


Observation Velocity Viscosity Mesh Size Response Standardized Residual hᵢᵢ
1 2.14 10.00 34 28.9 2.01721 .202242
2 4.14 10.00 34 26.1 1.34706 .066929
3 8.15 10.00 34 22.8 .96537 .274393
4 2.14 2.63 34 24.2 1.29177 .224518
5 4.14 2.63 34 15.7 -.68311 .079651
6 8.15 2.63 34 18.3 .23785 .267959
7 5.60 1.25 34 18.1 .06456 .076001
8 4.30 2.63 34 19.1 .13131 .074927
9 4.30 2.63 34 15.4 -.74091 .074927
10 5.60 10.10 25 12.0 -1.38857 .152317
11 5.60 10.10 34 19.8 -.03585 .068468
12 4.30 10.10 34 18.6 -.40699 .062849
13 2.40 10.10 34 13.2 -1.92274 .175421
14 5.60 10.10 55 22.8 -1.07990 .712933
15 2.14 112.00 34 41.8 -1.19311 .516298
16 4.14 112.00 34 48.6 1.21302 .513214
17 5.60 10.10 25 19.2 .38451 .152317
18 5.60 10.10 25 18.4 .18750 .152317
19 5.60 10.10 25 15.0 -.64979 .152317
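
One common screening rule flags observation i as having high leverage when hᵢᵢ exceeds 2(k + 1)/n, and treats a standardized residual as large when its absolute value exceeds 2. The Python sketch below applies both rules to the values tabulated for Exercise 62 (k = 3 predictors, n = 19 observations). The residual cutoff of 2 is a convention assumed here rather than something the exercise states, and the variable names are illustrative.

```python
# h_ii values for observations 1-19, copied from the Exercise 62 table.
leverages = [.202242, .066929, .274393, .224518, .079651, .267959, .076001,
             .074927, .074927, .152317, .068468, .062849, .175421, .712933,
             .516298, .513214, .152317, .152317, .152317]

# Standardized residuals for the same observations.
std_residuals = [2.01721, 1.34706, .96537, 1.29177, -.68311, .23785, .06456,
                 .13131, -.74091, -1.38857, -.03585, -.40699, -1.92274,
                 -1.07990, -1.19311, 1.21302, .38451, .18750, -.64979]

k = 3                      # predictors in the fitted model
n = len(leverages)         # number of observations

# Rule-of-thumb leverage cutoff 2(k + 1)/n.
leverage_cutoff = 2 * (k + 1) / n

# Observation numbers are 1-based, matching the table.
high_leverage = [i + 1 for i, h in enumerate(leverages) if h > leverage_cutoff]
large_residual = [i + 1 for i, e in enumerate(std_residuals) if abs(e) > 2]

print("leverage cutoff:", round(leverage_cutoff, 3))
print("high-leverage observations:", high_leverage)
print("large standardized residuals:", large_residual)
```

As a consistency check on the transcribed table, the hᵢᵢ values of a fitted model with k predictors and an intercept sum to k + 1, so `sum(leverages)` should be very close to 4.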
63. Multiple regression output from Minitab for the PAH
data of Exercise 53 in the previous section included the
following information:

Obs x1 flth Fit SE Fit Residual St Resid
6 243500 604.7 582.9 40.7 21.8 1.25X
7 67793 27.7 139.3 12.3 -111.6 -2.62R

R denotes an observation with a large standardized
residual

X denotes an observation whose X value gives it
large influence.

What does this suggest about the appropriateness of
using the previously given fitted equation as a basis
for inferences? The investigators actually eliminated
observation #7 and re-regressed. Does this make
sense?


64. The article “Bank Full Discharge of Rivers” (Water
Resources J., 1978: 1141-1154) reports data on discharge
amount (q, in m³/sec), flow area (a, in m²), and
slope of the water surface (b, in m/m) obtained at a number
of floodplain stations. A subset of the data follows.

q 17.6 23.8 5.7 3.0 7.5
a 8.4 31.6 5.7 1.0 3.3
b .0048 .0073 .0037 .0412 .0416

q 89.2 60.9 27.5 13.2 12.2
a 41.1 26.2 16.4 6.7 9.7
b .0063 .0061 .0036 .0039 .0025

The article proposed a multiplicative power model
Q = αa^βb^γε.

Let y = ln(q), x₁ = ln(a), and x₂ = ln(b). Consider fitting
the model Y = β₀ + β₁x₁ + β₂x₂ + ε.

a. The resulting hᵢᵢ’s are .138, .302, .266, .604, .464,
.360, .215, .153, .214, and .284. Does any observation
appear to be influential?

b. The estimated coefficients are β̂₀ = 1.5652, β̂₁ =
.9450, and β̂₂ = .1815, and the corresponding estimated
standard deviations are s_β̂₀ = .7328, s_β̂₁ =
.1528, and s_β̂₂ = .1752. The second standardized
residual is e₂* = 2.19. When the second observation is
omitted from the data set, the resulting estimated coefficients
are β̂₀ = 1.8982, β̂₁ = 1.025, and β̂₂ = .3085.
Do any of these changes indicate that the second
observation is influential?

c. Deletion of the fourth observation (why?) yields
β̂₀ = 1.4592, β̂₁ = .9850, and β̂₂ = .1515. Is this
observation influential?
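
One way to judge the coefficient changes described in Exercise 64 is to express each change as a multiple of the corresponding estimated standard deviation from the full-data fit. The Python sketch below does this for the two deletions mentioned in the exercise. It is only an illustration with hypothetical names, and it deliberately applies no formal cutoff; deciding how many standard deviations count as a substantial shift is left to the reader.

```python
# Full-data estimates (intercept, coefficient on x1, coefficient on x2)
# and their estimated standard deviations, as given in Exercise 64.
full_fit = [1.5652, 0.9450, 0.1815]
std_devs = [0.7328, 0.1528, 0.1752]

# Estimates after deleting a single observation (key = deleted observation).
refits = {
    2: [1.8982, 1.025, 0.3085],
    4: [1.4592, 0.9850, 0.1515],
}

def shifts_in_sd_units(refit):
    """Change in each estimate divided by its full-data standard deviation."""
    return [(new - old) / sd for old, new, sd in zip(full_fit, refit, std_devs)]

shifts = {obs: shifts_in_sd_units(coefs) for obs, coefs in refits.items()}

for obs, values in shifts.items():
    formatted = ", ".join(f"{v:+.2f}" for v in values)
    print(f"deleting observation {obs}: shifts of {formatted} standard deviations")
```

Comparing the two rows of output shows how deleting one observation can move every estimate by a sizable fraction of a standard deviation while deleting another barely changes the fit.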


SUPPLEMENTARY EXERCISES (65-83) 


65. Curing concrete is known to be vulnerable to shock vibrations,
which may cause cracking or hidden damage to the
material. As part of a study of vibration phenomena, the
paper “Shock Vibration Test of Concrete” (ACI
Materials J., 2002: 361-370) reported the accompanying
data on peak particle velocity (mm/sec) and ratio of
ultrasonic pulse velocity after impact to that before
impact in concrete prisms.


Obs ppv Ratio Obs ppv Ratio
1 160 .996 16 708 .990
2 164 .996 17 806 .984
3 178 .999 18 884 .986
4 252 .997 19 526 .991
5 293 .993 20 490 .993
6 289 .997 21 598 .993
7 415 .999 22 505 .993
8 478 .997 23 525 .990
9 391 .992 24 675 .991
10 486 .985 25 1211 .981
11 604 .995 26 1036 .986
12 528 .995 27 1000 .984
13 749 .994 28 1151 .982
14 772 .994 29 1144 .962
15 532 .987 30 1068 .986


Transverse cracks appeared in the last 12 prisms, whereas
there was no observed cracking in the first 18 prisms.

a. Construct a comparative boxplot of ppv for the
cracked and uncracked prisms and comment. Then
estimate the difference between true average ppv for
cracked and uncracked prisms in a way that conveys
information about precision and reliability.

b. The investigators fit the simple linear regression
model to the entire data set consisting of 30 observations,
with ppv as the independent variable and ratio
as the dependent variable. Use a statistical software
package to fit several different regression models,
and draw appropriate inferences.


66. The article “Applying Regression Analysis to Improve
Dyeing Process Quality: A Case Study” (Intl. J. of
Advanced Manuf. Tech., 2010: 357-368) examined the
practice of adjusting pH of dye liquor at a large manufacturer
of automotive carpets. The investigation was based
on a data set consisting of 114 observations (included in
the article). The dependent variable is y = pH before
addition of dyes, and the predictors are x₁ = carpet
density (oz/yd³), x₂ = carpet weight (lb), x₃ = dye
weight (g), x₄ = dye weight as a percentage of carpet
weight (%), and x₅ = pH after addition of dyes.

a. Here is output from Minitab’s Best Subsets
Regression option. Which model(s) would you recommend,
and why?

Predictors: x1, x2, x3, x4, x5

Vars R-Sq R-Sq(adj) Mallows Cp S
1 63.7 63.4 16.6 0.34971 X
1 4.4 3.5 223.6 0.56773 X
2 68.7 68.1 1.2 0.32630 X X
2 68.6 68.0 1.6 0.32684 X X
3 69.0 68.2 2.2 0.32616 X X X
3 68.9 68.0 2.5 0.32668 X X X
4 69.0 67.9 4.1 0.32754 X X X X
4 69.0 67.9 4.1 0.32759 X X X X
5 69.0 67.6 6.0 0.32894 X X X X X

b. The cited article recommended the model with just x₃
and x₅ as predictors. The following Minitab output
resulted from fitting that model.

Predictor Coef SE Coef T P
Constant 0.9402 0.2814 3.34 0.001
x3 -0.00004639 0.00001104 -4.20 0.000
x5 0.73710 0.04813 15.31 0.000

S = 0.326304 R-Sq = 68.7% R-Sq(adj) = 68.1%

Analysis of Variance

Source DF SS MS F P
Regression 2 25.925 12.962 121.74 0.000
Residual Error 111 11.819 0.106
Total 113 37.744

Does this model appear to specify a useful relationship
between the response variable and the predictors?
[Note: The pattern in a normal probability plot
of the standardized residuals is very linear. The plots
of standardized residuals against both x₃ and x₅ show
no discernible pattern. There is one observation
whose x₃ value is more than twice as large as for any
other observation, but with n = 114, this observation
has very little influence on the fit.]

c. Should either one of the two predictors be eliminated
from the model provided that the other predictor is
retained? Explain your reasoning.

d. Calculate and interpret 95% CIs for the β coefficients
of the two model predictors.

e. The estimated standard deviation of Ŷ when x₃ =
1000 and x₅ = 6 is .0336. Obtain and interpret a 95%
CI for true average pH before addition of dyes under
these circumstances.


67. The article “Validation of the Rockport Fitness
Walking Test in College Males and Females”
(Research Quarterly for Exercise and Sport, 1994:
152-158) recommended the following estimated regression
equation for relating y = VO₂max (L/min, a measure
of cardiorespiratory fitness) to the predictors
x₁ = gender (female = 0, male = 1), x₂ = weight (lb),
x₃ = 1-mile walk time (min), and x₄ = heart rate at the
end of the walk (beats/min):


y = 3.5959 + .6566x₁ + .0096x₂
− .0996x₃ − .0080x₄


a. How would you interpret the estimated coefficient
β̂₃ = −.0996?

b. How would you interpret the estimated coefficient
β̂₁ = .6566?
c. Suppose that an observation made on a male whose
weight was 170 lb, walk time was 11 min, and heart
rate was 140 beats/min resulted in VO₂max = 3.15.
What would you have predicted for VO₂max in this
situation, and what is the value of the corresponding
residual?

d. Using SSE = 30.1033 and SST = 102.3922, what
proportion of observed variation in VO₂max can be
attributed to the model relationship?

e. Assuming a sample size of n = 20, carry out a test of
hypotheses to decide whether the chosen model
specifies a useful relationship between VO₂max and
at least one of the predictors.


68. Feature recognition from surface models of complicated
parts is becoming increasingly important in the
development of efficient computer-aided design (CAD)
systems. The article “A Computationally Efficient
Approach to Feature Abstraction in Design-Manufacturing
Integration” (J. of Engr. for
Industry, 1995: 16-27) contained a graph of log₁₀(total
recognition time), with time in sec, versus log₁₀(number
of edges of a part), from which the following representative
values were read:


Log(edges) 1.1 1.5 1.7 1.9 2.0 2.1
Log(time) .30 .50 .55 .52 .85 .98

Log(edges) 2.2 2.3 2.7 2.8 3.0 3.3
Log(time) 1.10 1.00 1.18 1.45 1.65 1.84

Log(edges) 3.5 3.8 4.2 4.3
Log(time) 2.05 2.46 2.50 2.76


a. Does a scatterplot of log(time) versus log(edges)
suggest an approximate linear relationship between
these two variables?

b. What probabilistic model for relating y = recognition
time to x = number of edges is implied by the simple
linear regression relationship between the transformed
variables?

c. Summary quantities calculated from the data are

n = 16   Σx′ᵢ = 42.4   Σy′ᵢ = 21.69
Σ(x′ᵢ)² = 126.34   Σ(y′ᵢ)² = 38.5305
Σx′ᵢy′ᵢ = 68.640

Calculate estimates of the parameters for the model
in part (b), and then obtain a point prediction of time
when the number of edges is 300.


69. Air pressure (psi) and temperature (°F) were measured
for a compression process in a certain piston-cylinder
device, resulting in the following data (from Introduction
to Engineering Experimentation, Prentice-Hall, Inc.,
1996, p. 153):




Pressure 20.0 40.4 60.8 80.2 100.4
Temperature 44.9 102.4 142.3 164.8 192.2

Pressure 120.3 141.1 161.4 181.9 201.4
Temperature 221.4 228.4 249.5 269.4 270.8

Pressure 220.8 241.8 261.1 280.4 300.1
Temperature 291.5 287.3 313.3 322.3 325.8

Pressure 320.6 341.1 360.8
Temperature 337.0 332.6 342.9


a. Would you fit the simple linear regression model to
the data and use it as a basis for predicting temperature
from pressure? Why or why not?

b. Find a suitable probabilistic model and use it as a
basis for predicting the value of temperature that
would result from a pressure of 200, in the most
informative way possible.


70. An aeronautical engineering student carried out an
experiment to study how y = lift/drag ratio related to
the variables x₁ = position of a certain forward lifting
surface relative to the main wing and x₂ = tail placement
relative to the main wing, obtaining the following
data (Statistics for Engineering Problem Solving,
p. 133):


x₁ (in.) x₂ (in.) y
-1.2 -1.2 .858
-1.2 0 3.156
-1.2 1.2 3.644
0 -1.2 4.281
0 0 3.481
0 1.2 3.918
1.2 -1.2 4.136
1.2 0 3.364
1.2 1.2 4.018


ȳ = 3.428, SST = 8.55


a. Fitting the first-order model gives SSE = 5.18,
whereas including x₃ = x₁x₂ as a predictor results in
SSE = 3.07. Calculate and interpret the coefficient
of multiple determination for each model.

b. Carry out a test of model utility using α = .05 for
each of the models described in part (a). Does either
result surprise you?


71. An ammonia bath is the one most widely used for depositing
Pd-Ni alloy coatings. The article “Modelling of
Palladium and Nickel in an Ammonia Bath in a
Rotary Device” (Plating and Surface Finishing, 1997:
102-104) reported on an investigation into how bath-composition
characteristics affect coating properties.
Consider the following data on x₁ = Pd concentration


(g/dm³), x₂ = Ni concentration (g/dm³), x₃ = pH,
x₄ = temperature (°C), x₅ = cathode current density
(A/dm²), and y = palladium content (%) of the coating.


pdconc niconc pH temp currdens pallcont
16 24 9.0 35 5 61.5
8 24 9.0 35 3 51.0
16 16 9.0 35 3 81.0
8 16 9.0 35 5 50.9
16 24 8.0 35 3 66.7
8 24 8.0 35 5 48.8
16 16 8.0 35 5 71.3
8 16 8.0 35 3 62.8
16 24 9.0 25 3 64.0
8 24 9.0 25 5 37.7
16 16 9.0 25 5 68.7
8 16 9.0 25 3 54.1
16 24 8.0 25 5 61.6
8 24 8.0 25 3 48.0
16 16 8.0 25 3 73.2
8 16 8.0 25 5 43.3
4 20 8.5 30 4 35.0
20 20 8.5 30 4 69.6
12 12 8.5 30 4 70.0
12 28 8.5 30 4 48.2
12 20 7.5 30 4 56.0
12 20 9.5 30 4 77.6
12 20 8.5 20 4 55.0
12 20 8.5 40 4 60.6
12 20 8.5 30 2 54.9
12 20 8.5 30 6 49.8
12 20 8.5 30 4 54.1
12 20 8.5 30 4 61.2
12 20 8.5 30 4 52.5
12 20 8.5 30 4 57.1
12 20 8.5 30 4 52.5
12 20 8.5 30 4 56.6


a. Fit the first-order model with five predictors and
assess its utility. Do all the predictors appear to be
important?

b. Fit the complete second-order model and assess its
utility.

c. Does the group of second-order predictors (interaction
and quadratic) appear to provide more useful information
about y than is contributed by the first-order predictors?
Carry out an appropriate test of hypotheses.

d. The authors of the cited article recommended the use
of all five first-order predictors plus the additional
predictor x₆ = (pH)². Fit this model. Do all six predictors
appear to be important?


72. The article “An Experimental Study of Resistance
Spot Welding in 1 mm Thick Sheet of Low Carbon
Steel” (J. of Engr. Manufacture, 1996: 341-348) discussed
a statistical analysis whose basic aim was to
establish a relationship that could explain the variation in
weld strength (y) by relating strength to the process characteristics
weld current (wc), weld time (wt), and electrode
force (ef).


a. SST = 16.18555, and fitting the complete second-order
model gave SSE = .80017. Calculate and
interpret the coefficient of multiple determination.

b. Assuming that n = 37, carry out a test of model utility
[the ANOVA table in the article states that
n − (k + 1) = 1, but other information given contradicts
this and is consistent with the sample size we
suggest].

c. The given F ratio for the current-time interaction
was 2.32. If all other predictors are retained in the
model, can this interaction predictor be eliminated?
[Hint: As in simple linear regression, an F ratio for a
coefficient is the square of its t ratio.]

d. The authors proposed eliminating two interaction predictors
and a quadratic predictor and recommended the
estimated equation y = 3.352 + .098wc + .222wt +
.297ef − .0102(wt)² − .037(ef)² + .0128(wc)(wt).
Consider a weld current of 10 kA, a weld time of 12 ac
cycles, and an electrode force of 6 kN. Supposing that
the estimated standard deviation of the predicted
strength in this situation is .0750, calculate a 95% PI
for strength. Does the interval suggest that the value of
strength can be accurately predicted?


73. The accompanying data on x = frequency (MHz) and
y = output power (W) for a certain laser configuration
was read from a graph in the article “Frequency
Dependence in RF Discharge Excited Waveguide CO₂
Lasers” (IEEE J. of Quantum Electronics, 1984:
509-514).


x 60 63 77 100 125 157 186 222

y 16 17 19 21 22 20 15 5


A computer analysis yielded the following information
for a quadratic regression model: β̂₀ = −1.5127,
β̂₁ = .391901, β̂₂ = −.00163141, s_β̂₂ = .00003391,
SSE = .29, SST = 202.88, and s_Ŷ = .1141 when
x = 100.


a. Does the quadratic model appear to be suitable for
explaining observed variation in output power by
relating it to frequency?

b. Would the simple linear regression model be nearly
as satisfactory as the quadratic model?

c. Do you think it would be worth considering a cubic
model?

d. Compute a 95% CI for expected power output when
frequency is 100.

e. Use a 95% PI to predict the power from a single
experimental run when frequency is 100.


74. The accompanying data on x₁ = card cylinder speed
(rpm), x₂ = card production rate (kg/h), x₃ = number of draw
frame doubling, and y = tenacity (RKM) appeared in the
article “Impact of Carding Parameters and Draw
Frame Doubling on the Properties of Ring Spun
Yarn” (J. of Engineered Fibers and Fabrics, 2013:
72-78).
Obs x₁ x₂ x₃ tenacity Obs x₁ x₂ x₃ tenacity
1 700 80 6 18.29 9 800 80 5 16.59
2 900 80 6 17.97 10 800 120 5 17.10
3 700 120 6 17.12 11 800 80 7 15.64
4 900 120 6 17.46 12 800 120 7 16.54
5 700 100 5 17.29 13 800 100 6 17.81
6 900 100 5 16.35 14 800 100 6 17.68
7 700 100 7 16.77 15 800 100 6 18.11
8 900 100 7 16.11
a. Minitab’s Best Subsets Regression option gave the
following output when applied to the complete second-order
model. Notice that adjusted R² for the model
containing all predictors is much smaller than R²
itself, indicating that the model contains too many
predictors relative to the sample size. Which model(s)
would you recommend, and why? [Note: The cited
article reported results only for the model with all
first- and second-order predictors.]
Predictors: x1, x2, x3, x1sqd, x2sqd, x3sqd, x1x2, x1x3, x2x3

Vars R-Sq R-Sq(adj) Mallows Cp S
1 10.9 4.0 11.6 0.76552 X
1 10.3 3.3 11.7 0.76826 X
2 73.4 69.0 -2.3 0.43510 X X
2 13.9 0.0 12.8 0.78320 X X
3 77.1 70.8 -1.2 0.42208 X X X
3 77.1 70.8 -1.2 0.42229 X X X
4 77.3 68.2 0.7 0.44046 X X X X
4 77.2 68.1 0.8 0.44115 X X X X
5 79.6 68.2 2.2 0.44073 X X X X X
5 79.3 67.9 2.2 0.44302 X X X X X
6 80.0 65.0 4.1 0.46247 X X X X X X
6 79.8 64.7 4.1 0.46426 X X X X X X
7 80.2 60.4 6.0 0.49156 X X X X X X X
7 80.0 60.0 6.1 0.49408 X X X X X X X
8 80.2 53.9 8.0 0.53059 X X X X X X X X
8 80.2 53.8 8.0 0.53094 X X X X X X X X
9 80.2 44.7 10.0 0.58123 X X X X X X X X X

b. When the model with predictors x₃, x₃², and x₁ was
fit, the t ratio corresponding to the coefficient on x₁
was −1.32. If the first two predictors remain in the
model, is inclusion of x₁ justified? Explain your
reasoning.

c. Here is output from relating y to x₃ via the quadratic
regression model. A normal probability plot of the
standardized residuals is quite straight, and the plot
of e* versus ŷ shows no discernible pattern. Does
this model specify a useful relationship, and should
the quadratic predictor be retained in the model?

Predictor Coef SE Coef T P
Constant -24.743 8.040 -3.08 0.010
x3 14.457 2.707 5.34 0.000
x3sqd -1.2284 0.2252 -5.46 0.000

S = 0.435097 R-Sq = 73.4% R-Sq(adj) = 69.0%

d. For the model of part (c), the standard deviation of a
predicted Y value when x₃ = 6 is .164. Predict tenacity
in this situation in a way that conveys information
about precision and reliability.


75. The effect of manganese (Mn) on wheat growth is examined
in the article “Manganese Deficiency and Toxicity
Effects on Growth, Development and Nutrient
Composition in Wheat” (Agronomy J., 1984: 213-217).
A quadratic regression model was used to relate
y = plant height (cm) to x = log₁₀(added Mn), with
μM as the units for added Mn. The accompanying data
was read from a scatterplot appearing in the article.

x -1.0 -.4 0 .2 1.0
y 32 37 44 45 46

x 2.0 2.8 3.2 3.4 4.0
y 42 42 40 37 30

In addition, β̂₀ = 41.7422, β̂₁ = 6.581, β̂₂ = −2.3621,
s_β̂₀ = .8522, s_β̂₁ = 1.002, s_β̂₂ = .3073, and SSE = 26.98.

a. Is the quadratic model useful for describing the relationship
between x and y? [Hint: Quadratic regression
is a special case of multiple regression with
k = 2, x₁ = x, and x₂ = x².] Apply an appropriate
procedure.

b. Should the quadratic predictor be eliminated?

c. Estimate expected height for wheat treated with
10 μM of Mn using a 90% CI. [Hint: The estimated
standard deviation of β̂₀ + β̂₁ + β̂₂ is 1.031.]


76. The article “Chemithermomechanical Pulp from Mixed
High Density Hardwoods” (TAPPI, July 1988: 145-146)
reports on a study in which the accompanying data was
obtained to relate y = specific surface area (cm²/g) to
x₁ = % NaOH used as a pretreatment chemical and
x₂ = treatment time (min) for a batch of pulp.


x₁ x₂ y
3 30 5.95 
3 60 5.60 
3 90 5.44 
9 30 6.22 
9 60 5.85 
9 90 5.61 

15 30 8.36 

15 60 7.30 

15 90 6.43 


The accompanying Minitab output resulted from a
request to fit the model Y = β₀ + β₁x₁ + β₂x₂ + ε.


The regression equation is
AREA = 6.05 + 0.142 NAOH - 0.0169 TIME


Predictor Coef Stdev t-ratio p
Constant 6.0483 0.5208 11.61 0.000
NAOH 0.14167 0.03301 4.29 0.005
TIME -0.016944 0.006601 -2.57 0.043

s = 0.4851 R-sq = 80.7% R-sq(adj) = 74.2%


Analysis of Variance 


SOURCE DF ss MS F Pp 
Regression 2 5.8854 2.9427 12.51 0.007 
Error 6 1.4118 0.2353 

Total 8 7.2972 


a. What proportion of observed variation in spe- 
cific surface area can be explained by the model 
relationship? 

b. Does the chosen model appear to specify a useful 
relationship between the dependent variable and the 
predictors? 

c. Provided that % NaOH remains in the model, would 
you suggest that the predictor treatment time be 
eliminated? 
d. Calculate a 95% CI for the expected change in specific
surface area associated with an increase of 1%
in NaOH when treatment time is held fixed.

e. Minitab reported that the estimated standard deviation
of β̂₀ + β̂₁(9) + β̂₂(60) is .162. Calculate a prediction
interval for the value of specific surface area to be
observed when % NaOH = 9 and treatment time = 60.
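
The interval calculations in parts (d) and (e) follow two standard forms: estimate ± t·(estimated standard deviation) for a coefficient, and ŷ ± t·√(s² + s_Ŷ²) for a single future observation. The Python sketch below works through the arithmetic with the numbers from the Minitab output. The critical value 2.447 is the tabled t value for 6 error df and central area .95; using a 95% level for the prediction interval is an assumption, since part (e) does not specify one, and all variable names are illustrative.

```python
import math

# Tabled t critical value for 6 error df, central area .95 (assumed level).
t_crit = 2.447

# Part (d): CI for the coefficient on % NaOH.
b1, se_b1 = 0.14167, 0.03301
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

# Part (e): prediction interval at % NaOH = 9, treatment time = 60.
b0, b2 = 6.0483, -0.016944
s = 0.4851        # estimated standard deviation about the regression
se_fit = 0.162    # estimated sd of the fitted value, as reported by Minitab
y_hat = b0 + b1 * 9 + b2 * 60
half_width = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
pi = (y_hat - half_width, y_hat + half_width)

print(f"CI for coefficient on NaOH: ({ci_b1[0]:.4f}, {ci_b1[1]:.4f})")
print(f"point prediction: {y_hat:.3f}")
print(f"prediction interval: ({pi[0]:.2f}, {pi[1]:.2f})")
```

The prediction interval is much wider than a confidence interval for the mean at the same settings would be, because s² dominates s_Ŷ² under the square root.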


77. The article “Sensitivity Analysis of a 2.5 kW Proton
Exchange Membrane Fuel Cell Stack by Statistical
Method” (J. of Fuel Cell Sci. and Tech., 2009: 1-6)
used regression analysis to investigate the relationship
between fuel cell power (W) and the independent
variables x₁ = H₂ pressure (psi), x₂ = H₂ flow (stoc),
x₃ = air pressure (psi), and x₄ = airflow (stoc).

a. Here is Minitab output from fitting the model with
the aforementioned independent variables as predictors
(also fit by the authors of the cited article):


Predictor Coef SE Coef T P
Constant 1507.3 206.8 7.29 0.000
x1 -4.282 4.969 -0.86 0.407
x2 7.46 62.11 0.12 0.907
x3 -0.9162 0.6227 -1.47 0.169
x4 90.60 24.84 3.65 0.004

S = 49.6885 R-Sq = 59.6% R-Sq(adj) = 44.9%

Source DF SS MS F P
Regression 4 40048 10012 4.06 0.029
Residual Error 11 27158 2469
Total 15 67206


Does there appear to be a useful relationship between
power and at least one of the predictors? Carry out a
formal test of hypotheses.


b. Fitting the model with predictors x,, x,, and the inter- 
action x,x, gave R? = .834. Does this model appear 
to be useful? Can an F test be used to compare this 
model to the model of (a)? Explain. 


c. Fitting the model with predictors x₁–x₄ as well as
all second-order interactions gave R² = .960 (this
model was also fit by the investigators). Does it
appear that at least one of the interaction predictors
provides useful information about power over and
above what is provided by the first-order predictors?
State and test the appropriate hypotheses using a
significance level of .05.
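
A test of whether a group of added predictors is useful can be written in terms of R² values: F = [(R²_full − R²_reduced)/(k − l)] / [(1 − R²_full)/(n − (k + 1))], where the reduced model has l predictors and the full model has k. The Python sketch below evaluates this statistic for the situation in part (c), taking R² = .596 and n = 16 from the output in part (a). The critical value 4.95 is the tabled F value for 6 numerator and 5 denominator df at level .05, supplied here as an assumption from a standard F table; the variable names are illustrative.

```python
from math import comb

n = 16                       # observations (Total DF = 15 in part (a))
l = 4                        # first-order predictors x1, ..., x4
r2_reduced = 0.596           # R-Sq of the first-order model, part (a)

n_interactions = comb(4, 2)  # all second-order interactions x_i * x_j
k = l + n_interactions       # predictors in the full model
r2_full = 0.960              # R-Sq reported in part (c)

# Partial F statistic for H0: all interaction coefficients equal 0.
numerator = (r2_full - r2_reduced) / (k - l)
denominator = (1 - r2_full) / (n - (k + 1))
f_stat = numerator / denominator

f_crit = 4.95                # assumed table value F(.05; 6, 5)
reject_h0 = f_stat > f_crit

print(f"k = {k}, F = {f_stat:.2f}, reject H0 at .05: {reject_h0}")
```

With only 5 error df in the full model the test has little power, so a conclusion reached this way should be read with the small sample size in mind.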


78. Coir fiber, derived from coconut, is an eco-friendly mate- 


rial with great potential for use in construction. The 
article “Seepage Velocity and Piping Resistance of 
Coir Fiber Mixed Soils” (J. of Irrig. and Drainage 
Engr., 2008: 485-492) included several multiple regres- 
sion analyses. The article’s authors kindly provided the 
accompanying data on x, = fiber content(% ), x, = 
fiber length(mm), x, = hydraulic gradient (no unit pro- 
vided), and y = seepage velocity (cm/sec). 
Obs cont Ingth grad vel Source DE Ss MS F P
1 0.0 0 0.400 0.027 Regression 3.0.129898 0.043299 164.27 0.000 
Residual Error 45 0.011862 0.000264 
2 0.0 0 0.716 0.050 eas oy leaaaee 
a , see a How would you interpret the number —.0003020 in 
5 0.0 0 1.226 0.107 the Coef column on output? 
6 0.0 0 1.427 0.140 b. Does fiber content appear to provide useful informa- 
7 0.0 0 1.709 0.178 tion about velocity provided that fiber length and 
8 0.0 0 1.872 0.200 hydraulic gradient remain in the model? Carry out a 
9 0.5 50 0.380 0.022 test of hypotheses. 
10 0.5 50 0.774 0.040 c. Fitting the model with just fiber length and hydraulic 
1 0.5 50 1.056 0.060 gradient as predictors gave the estimated regression 
12 0.5 50 1.329 0.111 coefficients 8, = —.005315, 8, = —.0004968, and 
13 0.5 50 1.598 0.158 B, = .102204 (the ¢ ratios for these two predictors are 
/ 0.5 50 1.799 0.188 both highly significant). In addition, s; = .00286 when 
i LO 50 oe 0.026 fiber length = 25 and hydraulic gradient = 1.2. Is 
" a 30 can oe there convincing evidence that true average velocity is 
. - me se ee something other than .1 in this situation? Carry out a 
19 10 50 1.090 0.070 test using a significance level of .05. 
20 1.0 50 1.239 0.088 d. Fitting the complete second-order model (as did the 
21 1.0 50 1.496 0.111 article’s authors) resulted in SSE = .003579. Does it 
22 1.0 50 1.744 0.134 appear that at least one of the second-order predic- 
23 1.0 50 1.915 0.145 tors provides useful information over and above what 
24 1.5 50 0.444 0.014 is provided by the three first-order predictors? Test 
25 1.5 50 0.821 0.037 the relevant hypotheses. 
- ie 7 ae ee 79. The article “A Statistical Analysis of the Notch 
28 15 50 1.581 0.112 Toughness of 9% Nickel Steels Obtained from 
29 15 50 1.983 0.144 Production Heats” (J. of Testing and Eval., 1987: 
30 1.0 25 0.462 0.028 355-363) reports on the results of a multiple regression 
31 1.0 25 0.705 0.059 analysis relating Charpy v-notch toughness y (joules) to 
32 1.0 25 0.987 0.084 the following variables: x, = plate thickness (mm), 
33 1.0 25 1.154 0.101 Xx, = carbon content (%), x, = manganese content (%), 
34 1.0 25 1.479 0.150 X4 = phosphorus content (%) x; = sulphur content (%), 
35 1.0 25 1.786 0.194 xX, = silicon content (%), x, = nickel content (%), 
36 1.0 25 1.957 0.218 Xg = yield strength (Pa), and x, = . tensile strength (Pa) 
37 1.0 40 0.419 0.030 a. The best possible subsets involved adding variables 
38 1.0 40 0.705 0.050 in the order x5, Xg, X¢, 3, Xp, X7, X9, X,, and x,. The 
. . 7 eee Bae values of R?, MSE,, and C, are as follows: 
41 1.0 40 1.470 0.126 No. of Predictors 1 2 3 4 
42 1.0 40 1.744 0.168 Fi 
43 1.0 60 0.436 0.034 Ry St ee ee 
44 1.0 60 0.650 0.051 MSE, 2295 1948 1742 1607 
45 1.0 60 0.889 0.068 Cy 314 173, 89.6 35.7 
46 1.0 60 1.222 0.093 
ie : 2 i ee No. of Predictors | 5 6 7 8 9 
49 1.0 60 1.983 0.173 R; 462; 570. .572: -575 ~ 575 
a. Here is output from fitting the model with the three a ae A ' 7 a 


x;’8 as predictors: 


Predictor Coef SE Coef T P 

Constant —0.002997 0.007639 —0.39 0.697 Which model would you recommend? Explain the 
fib cont —0.012125 0.007454 ~—1.63 0.111 stiannledoe-voub cholee 

fib Ingth -—0.0003020 0.0001676 ~—1.80 0.078 ¥ . 

hyd grad 0.102489 0.004711 21.76 0.000 b. The authors also considered second-order models 


involving predictors x; and x;x;. Information on the 


best such models starting with the variables x5, x3, x, 


bh 
oe 


S=0.0162355 R-Sq=91.6% R-Sq(adj) =91. 
80.


81. 


X¢ X7, and xz is as follows (in going from the best four- 
predictor model to the best five-predictor model, x, was 
deleted and both x,x, and x,x, were entered, and x, was 
reentered at a later stage): 


No. of Predictors 1 2 3 4 5 


R; AIS 541 .600 .629  .650 
MSE, 2079 1636 1427 1324 1251 
C, 433 109 104 524 16.5 


No. of Predictors 6 7 8 9 10 


R? 652.655. .658 = .659_~—.659 
MSE, 1246 1237 1229 1229 1230 
C, 149 11.2 8.5 9.2 11.0 


Which of these models would you recommend, and 
why? [Note: Models based on eight of the original vari- 
ables did not yield marked improvement on those under 
consideration here. ] 


A sample of n = 20 companies was selected, and the
values of y = stock price and k = 15 variables (such as
quarterly dividend, previous year’s earnings, and debt
ratio) were determined. When the multiple regression
model using these 15 predictors was fit to the data,
$R^2 = .90$ resulted.

a. Does the model appear to specify a useful relationship
between y and the predictor variables? Carry out
a test using significance level .05. [Hint: The F critical
value for 15 numerator and 4 denominator df is
5.86.]

b. Based on the result of part (a), does a high $R^2$ value
by itself imply that a model is useful? Under what
circumstances might you be suspicious of a model
with a high $R^2$ value?

c. With n and k as given previously, how large would $R^2$
have to be for the model to be judged useful at the
.05 level of significance?


Does exposure to air pollution result in decreased life
expectancy? This question was examined in the article
“Does Air Pollution Shorten Lives?” (Statistics and
Public Policy, Reading, MA, Addison-Wesley, 1977).
Data on

y = total mortality rate (deaths per 10,000)
$x_1$ = mean suspended particle reading ($\mu$g/m³)
$x_2$ = smallest sulfate reading ([$\mu$g/m³] × 10)
$x_3$ = population density (people/mi²)
$x_4$ = (percent nonwhite) × 10
$x_5$ = (percent over 65) × 10

for the year 1960 was recorded for n = 117 randomly
selected standard metropolitan statistical areas. The estimated
regression equation was

$$y = 19.607 + .041x_1 + .071x_2 + .001x_3 + .041x_4 + .687x_5$$

a. For this model, $R^2 = .827$. Using a .05 significance
level, perform a model utility test.

b. The estimated standard deviation of $\hat{\beta}_1$ was .016.
Calculate and interpret a 90% CI for $\beta_1$.

c. Given that the estimated standard deviation of $\hat{\beta}_4$ is
.007, determine whether percent nonwhite is an
important variable in the model. Use a .01 significance
level.

d. In 1960, the values of $x_1$, $x_2$, $x_3$, $x_4$, and $x_5$ for
Pittsburgh were 166, 60, 788, 68, and 95, respectively.
Use the given regression equation to predict
Pittsburgh’s mortality rate. How does your prediction
compare with the actual 1960 value of 103 deaths per
10,000?


82. Given that $R^2 = .723$ for the model containing predictors
$x_1$, $x_4$, $x_5$, and $x_8$ and $R^2 = .689$ for the model with predictors
$x_1$, $x_3$, $x_5$, and $x_6$, what can you say about $R^2$ for the
model containing predictors

a. $x_1$, $x_3$, $x_4$, $x_5$, $x_6$, and $x_8$? Explain.


b. x, and x,? Explain. 


83. An article in Lubrication Engr. (“Accelerated Testing
of Solid Film Lubricants,” 1972: 365-372) reported
on an investigation of wear life (y, in hr) for solid film
lubricant. Three sets of journal bearing tests were run
on a Mil-L-8937-type film at each combination of
three speeds ($x_1$, in rpm), and three loads ($x_2$, in 1000s
of hr). The values of $x_1$ for the resulting 27 observations
were 20, 20, …, 20, 60, …, 60, 100, …, 100, and
the values of $x_2$ were 3, 3, 3, 6, 6, 6, 10, 10, 10, 3, 3, 3, …,
10, 10, 10. The corresponding values of y = wear life
(hr) were 300.2, 310.8, 333.0, 99.6, 136.2, 142.4, 20.2,
28.2, 102.7, 67.3, 77.9, 93.9, 43.0, 44.5, 65.9, 10.7,
34.1, 39.1, 26.5, 22.3, 34.8, 32.8, 25.6, 32.7, 2.3, 4.4,
and 5.8.

The investigators commented that a lognormal
distribution is appropriate for Y because ln(Y) is known
to follow a normal law, and then proposed the multiplicative
power regression model $Y = \alpha x_1^{\beta} x_2^{\gamma} \varepsilon$.

a. Estimate the model parameters.

b. Interpret $R^2$ for the transformed model, and then
carry out a model utility test.

c. Does it appear that both predictors provide useful
information about wear life?

d. Predict wear life when speed is 50 and load is 5 in a
way that conveys information about precision and
reliability.
Goodness-of-Fit Tests and Categorical Data Analysis


INTRODUCTION 


In the simplest type of situation considered in this chapter, each observation 
in a sample is classified as belonging to one of a finite number of categories 
(e.g., blood type could be one of the four categories O, A, B, or AB). Let $p_i$
denote the probability that any particular observation belongs in category i
(or the proportion of the population belonging to category i). We then wish to
test a null hypothesis that completely specifies the values of all the $p_i$’s (such as
$H_0$: $p_1 = .45$, $p_2 = .35$, $p_3 = .15$, $p_4 = .05$, when there are four categories). The
test statistic is based on how different the numbers of observations in the various
categories are from the corresponding expected numbers when $H_0$ is true.
Because the reference distribution for determining the P-value is a chi-squared 
distribution, the procedure is called a chi-squared goodness-of-fit test. 

Sometimes the null hypothesis specifies that the $p_i$’s depend on some
smaller number of parameters without specifying the values of these parameters.
For example, with three categories the null hypothesis might state that
$p_1 = \theta^2$, $p_2 = 2\theta(1 - \theta)$, and $p_3 = (1 - \theta)^2$. For a chi-squared test to be performed,
the values of any unspecified parameters must be estimated from the
sample data. Section 14.2 develops methodology for doing this. The methods
are then applied to test a null hypothesis that states that the sample comes from
a particular family of distributions, such as the Poisson family (with $\mu$ estimated
from the sample) or the normal family (with $\mu$ and $\sigma$ estimated). In addition, a
test based on a normal probability plot is presented for the null hypothesis of 
population normality. 

Chi-squared tests for two different situations are considered in Section 14.3. 
In the first, the null hypothesis states that the $p_i$’s are the same for several
different populations. The second type of situation involves taking a sample from
a single population and classifying each individual with respect to two different
categorical factors (such as religious preference and political-party registration). 
The null hypothesis in this situation is that the two factors are independent within 
the population. 


14.1 Goodness-of-Fit Tests When Category Probabilities Are Completely Specified


A binomial experiment consists of a sequence of independent trials in which each 
trial can result in one of two possible outcomes: S (for success) and F (for failure).
The probability of success, denoted by p, is assumed to be constant from trial to trial,
and the number n of trials is fixed at the outset of the experiment. In Chapter 8, we
presented a large-sample z test for testing $H_0$: $p = p_0$. Notice that this null hypothesis
specifies both P(S) and P(F), since if $P(S) = p_0$, then $P(F) = 1 - p_0$. Denoting
P(F) by q and $1 - p_0$ by $q_0$, the null hypothesis can alternatively be written as
$H_0$: $p = p_0$, $q = q_0$. The z test is two-tailed when the alternative of interest is $p \neq p_0$.

A multinomial experiment generalizes a binomial experiment by allowing 
each trial to result in one of k possible outcomes, where k > 2. For example, suppose 
a store accepts three different types of credit cards. A multinomial experiment would 
result from observing the type of credit card used—type 1, type 2, or type 3—by each 
of the next n customers who pay with a credit card. In general, we will refer to the k 
possible outcomes on any given trial as categories, and $p_i$ will denote the probability
that a trial results in category i. If the experiment consists of selecting n individuals
or objects from a population and categorizing each one, then $p_i$ is the proportion of
the population falling in the ith category (such an experiment will be approximately 
multinomial provided that n is much smaller than the population size). 

The null hypothesis of interest will specify the value of each p;. For example, 
in the case k = 3, we might have $H_0$: $p_1 = .5$, $p_2 = .3$, $p_3 = .2$. The alternative
hypothesis will state that $H_0$ is not true; that is, at least one of the $p_i$’s has a
value different from that asserted by $H_0$ (in which case at least two must be different,
since they sum to 1). The symbol $p_{i0}$ will represent the value of $p_i$ claimed by the null
hypothesis. In the example just given, $p_{10} = .5$, $p_{20} = .3$, and $p_{30} = .2$.

Before the multinomial experiment is performed, the number of trials that 
will result in category i (i = 1, 2,..., or k) is a random variable—just as the num- 
ber of successes and the number of failures in a binomial experiment are random 
variables. This random variable will be denoted by $N_i$ and its observed value by $n_i$.
Since each trial results in exactly one of the k categories, $\sum N_i = n$, and the same is
true of the $n_i$’s. As an example, an experiment with n = 100 and k = 3 might yield
$N_1 = 46$, $N_2 = 35$, and $N_3 = 19$.

The expected number of successes and expected number of failures in a binomial
experiment are np and nq, respectively. When $H_0$: $p = p_0$, $q = q_0$ is true, the
expected numbers of successes and failures are $np_0$ and $nq_0$, respectively. Similarly,
in a multinomial experiment the expected number of trials resulting in category i is
$E(N_i) = np_i$ (i = 1, …, k). When $H_0$: $p_1 = p_{10}, \ldots, p_k = p_{k0}$ is true, these expected
values become $E(N_1) = np_{10}$, $E(N_2) = np_{20}$, …, $E(N_k) = np_{k0}$. For the case k = 3,
$H_0$: $p_1 = .5$, $p_2 = .3$, $p_3 = .2$, and n = 100, the expected frequencies when $H_0$ is true
are $E(N_1) = 100(.5) = 50$, $E(N_2) = 30$, and $E(N_3) = 20$. The $n_i$’s and corresponding
expected frequencies are often displayed in a tabular format as shown in Table 14.1.
The expected values when $H_0$ is true are displayed just below the observed values.
The $N_i$’s and $n_i$’s are often referred to as observed cell counts (or observed cell
frequencies), and $np_{10}, np_{20}, \ldots, np_{k0}$ are the corresponding expected cell counts
under $H_0$.


Table 14.1 Observed and Expected Cell Counts

Category    i = 1        i = 2        …    i = k        Row total
Observed    $n_1$        $n_2$        …    $n_k$        n
Expected    $np_{10}$    $np_{20}$    …    $np_{k0}$    n


The $n_i$’s should all be reasonably close to the corresponding $np_{i0}$’s when $H_0$ is
true. On the other hand, several of the observed counts should differ substantially from
these expected counts when the actual values of the $p_i$’s differ markedly from what the
null hypothesis asserts. The test procedure involves assessing the discrepancy between
the $n_i$’s and the $np_{i0}$’s. It is natural to base a measure of discrepancy on the squared
deviations $(n_1 - np_{10})^2, (n_2 - np_{20})^2, \ldots, (n_k - np_{k0})^2$. A seemingly sensible way to
combine these into an overall measure is to add them together to obtain $\sum (n_i - np_{i0})^2$.
However, suppose $np_{10} = 100$ and $np_{20} = 10$. Then if $n_1 = 95$ and $n_2 = 5$, the two
categories contribute the same squared deviations to the proposed measure. Yet $n_1$ is
only 5% less than what would be expected when $H_0$ is true, whereas $n_2$ is 50% less.
To take relative magnitudes of the deviations into account, each squared deviation
is divided by the corresponding expected count.
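
The point of the preceding paragraph can be checked numerically. The short Python sketch below uses only the two hypothetical cells mentioned in the text (expected counts 100 and 10, observed counts 95 and 5); the variable names are illustrative and are not part of the original exposition.

```python
# Two hypothetical cells from the text: expected 100 and 10, observed 95 and 5.
expected = [100, 10]
observed = [95, 5]

# Raw squared deviations: both cells contribute the same amount.
raw = [(n - e) ** 2 for n, e in zip(observed, expected)]

# Dividing by the expected count weights each deviation by its relative size.
scaled = [(n - e) ** 2 / e for n, e in zip(observed, expected)]

print(raw)     # [25, 25]
print(scaled)  # [0.25, 2.5]
```

After scaling, the second cell (50% below its expected count) contributes ten times as much as the first (5% below).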

Before giving a more detailed description, we must reintroduce a type of probability
distribution called the chi-squared ($\chi^2$) distribution. This distribution was
first encountered in Section 4.4 and was used in Chapter 7 to obtain a confidence
interval for the variance $\sigma^2$ of a normal population. The chi-squared distribution has
a single parameter $\nu$, called the number of degrees of freedom (df) of the distribution,
with possible values 1, 2, 3, …. If $Y \sim \chi^2$ with $\nu$ df, then $E(Y) = \nu$ and $V(Y) = 2\nu$.
Figure 14.1 shows a typical $\chi^2$ density curve; it is positively skewed, but moves
rightward and becomes more symmetric and spread out as $\nu$ increases.
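
A small simulation can make the two moment formulas concrete. The sketch below relies on a standard fact that this passage does not restate: the sum of squares of $\nu$ independent standard normal variables has a chi-squared distribution with $\nu$ df. The function name, seed, and sample size are arbitrary choices made for illustration.

```python
import random
import statistics

def simulate_chi_squared(df, size, rng):
    """Draw `size` chi-squared(df) values as sums of df squared N(0, 1) draws."""
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(size)]

rng = random.Random(2016)   # fixed seed so the run is reproducible
df = 4
sample = simulate_chi_squared(df, 20000, rng)

sample_mean = statistics.fmean(sample)    # should be near df
sample_var = statistics.variance(sample)  # should be near 2 * df
print(round(sample_mean, 2), round(sample_var, 2))
```

With 20,000 draws the sample mean and variance should land close to 4 and 8, though the exact values depend on the seed.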


Figure 14.1 A typical chi-squared density curve (small $\nu$)


THEOREM Provided that $np_i \geq 5$ for every i (i = 1, 2, …, k), the variable

$$\chi^2 = \sum_{i=1}^{k} \frac{(N_i - np_i)^2}{np_i} = \sum_{\text{all cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

has approximately a chi-squared distribution with k − 1 df.


The fact that df = k − 1 is a consequence of the restriction $\sum N_i = n$. Although there
are k observed cell counts, once any k − 1 are known, the remaining one is uniquely


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 




determined. That is, there are only k — 1 “freely determined” cell counts, and thus 
k — 1 df. 

If $np_{i0}$ is substituted for $np_i$ in $\chi^2$, the resulting test statistic has approximately a chi-squared
distribution when $H_0$ is true. The more the observed frequencies differ from
expected frequencies, the larger the value of $\chi^2$ will be. Since the test statistic utilizes
expected frequencies assuming that $H_0$ is true, any test statistic value larger
than the calculated $\chi^2$ will be even more contradictory to $H_0$ than this calculated
value. The implication is that the test is upper-tailed: The P-value will be the area
under the relevant chi-squared curve to the right of the calculated $\chi^2$ value.


Null hypothesis: $H_0$: $p_1 = p_{10}, p_2 = p_{20}, \ldots, p_k = p_{k0}$

Alternative hypothesis: $H_a$: at least one $p_i$ does not equal $p_{i0}$

Test statistic value:

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \sum_{i=1}^{k} \frac{(n_i - np_{i0})^2}{np_{i0}}$$

Provided that $np_{i0} \geq 5$ for all i, the P-value is (approximately) the area under
the $\chi^2_{k-1}$ curve to the right of the calculated value of $\chi^2$. If $np_{i0} < 5$ for at
least one i, categories should be combined in a sensible way to correct this
deficiency.
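
The boxed procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `chi2_upper_tail` and `goodness_of_fit` are hypothetical, and the tail area is computed from the standard finite-sum expressions for integer df instead of being read from a table. The data are the illustrative counts 46, 35, 19 used earlier in this section with $H_0$: $p_1 = .5$, $p_2 = .3$, $p_3 = .2$.

```python
import math

def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-squared curve with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction sum.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * j + 1)
    return tail

def goodness_of_fit(observed, null_probs):
    """Return (chi-squared statistic, df, approximate P-value) for a simple H0."""
    n = sum(observed)
    expected = [n * p for p in null_probs]
    if min(expected) < 5:
        raise ValueError("combine categories: some expected count is below 5")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2_upper_tail(stat, df)

stat, df, p_value = goodness_of_fit([46, 35, 19], [0.5, 0.3, 0.2])
print(round(stat, 3), df, round(p_value, 3))
```

Raising an error when an expected count falls below 5 mirrors the box's advice to combine categories first; a different design could merge cells automatically, but the choice of which cells to merge is usually a judgment call.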


Table A.7 gives chi-squared critical values $\chi^2_{\alpha,\nu}$ that capture specified areas $\alpha$ under
various chi-squared curves (analogous to what $t_{\alpha,\nu}$ does for t curves). But because the
tabulation is for only five small values of $\alpha$, limited information about a P-value is
available. We have therefore included another appendix table, similar to the t curve
tail areas of Table A.8, that facilitates making more precise P-value statements.
The fact that t curves were all centered at zero allowed us to tabulate t-curve
tail areas in a relatively compact way, with the left margin giving values ranging from
0.0 to 4.0 on the horizontal t scale and various columns displaying corresponding
upper-tail areas for various df’s. The rightward movement of chi-squared curves as
df increases necessitates a somewhat different type of tabulation. The left margin of
Appendix Table A.11 displays various upper-tail areas: .100, .095, .090, …, .005,
and .001. Each column of the table is for a different value of df, and the entries are
values on the horizontal chi-squared axis that capture these corresponding tail areas.
For example, moving down to tail area .085 and across to the 4 df column, we see that
the area to the right of 8.18 under the 4 df chi-squared curve is .085 (see Figure 14.2).
Capturing the same upper-tail area under the 10 df curve requires going out to 16.54.


Figure 14.2 A P-value for an upper-tailed chi-squared test (chi-squared density curve for 4 df; the shaded area to the right of the calculated $\chi^2 = 8.18$ is .085)




Returning to the 4 df column of Table A.11, we see that the area under the $\chi^2_4$ curve
to the right of 8.33 is .08. Thus the calculated value $\chi^2 = 8.20$ implies that .08 <
P-value < .085. In this case $H_0$ would be rejected at significance level .10 but not at
levels .05 or .01. The top row of the 4 df column shows that if the calculated value
of the chi-squared variable is smaller than 7.77, the captured tail area (the P-value)
exceeds .10. Similarly, the bottom row in this column indicates that if the calculated
value exceeds 18.46, the tail area is smaller than .001 (P-value < .001).
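
For an even number of degrees of freedom the upper-tail area has a simple closed form, which offers a way to check the 4 df entries quoted above. The derivation below is a supplementary sketch (one integration by parts of the 4 df density) and is not how the appendix table itself is described.

```latex
% Chi-squared density with 4 df: f(t) = (1/4) t e^{-t/2} for t > 0
\begin{align*}
P(\chi^2_4 > x) &= \int_x^{\infty} \tfrac{1}{4}\, t\, e^{-t/2}\, dt
  = \tfrac{1}{4}\Bigl[-2t\,e^{-t/2} - 4e^{-t/2}\Bigr]_x^{\infty}
  = e^{-x/2}\Bigl(1 + \frac{x}{2}\Bigr) \\
x = 7.77 &: \quad e^{-3.885}(4.885) \approx .100 \\
x = 8.18 &: \quad e^{-4.09}(5.09) \approx .085 \\
x = 8.33 &: \quad e^{-4.165}(5.165) \approx .080
\end{align*}
```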


EXAMPLE 14.1 Genetics provides a rich area for application of chi-squared testing. Let’s focus on 
two different characteristics of an organism, each controlled by a single gene, and 
consider crossing a pure strain having genotype AABB with a pure strain having 
genotype aabb (capital letters denoting dominant alleles and small letters recessive 
alleles). The resulting genotype will be AaBb. If these first-generation organisms 
are then crossed among themselves (a dihybrid cross), there will be four phenotypes 
depending on whether a dominant allele of either type is present. Mendel’s laws of 
inheritance imply that these four phenotypes should have probabilities 
9/16, 3/16, 3/16, and 1/16 of arising in any given dihybrid cross. 

The article “Linkage Studies of the Tomato” (Trans. Royal Canadian 
Institute, 1931: 1-19) reports the following data on phenotypes from a dihybrid 
cross of tall cut-leaf tomatoes with dwarf potato-leaf tomatoes. There are k = 4 categories
corresponding to the four possible phenotypes, with the null hypothesis being

$$H_0\colon\ p_1 = \frac{9}{16},\quad p_2 = \frac{3}{16},\quad p_3 = \frac{3}{16},\quad p_4 = \frac{1}{16}$$

The expected cell counts are 9n/16, 3n/16, 3n/16, and n/16, and the test is based on
k − 1 = 3 df. The total sample size was n = 1611. Observed and expected counts
are given in Table 14.2.

Table 14.2 Observed and Expected Cell Counts for Example 14.1

             i = 1       i = 2          i = 3       i = 4
             Tall,       Tall,          Dwarf,      Dwarf,
             cut leaf    potato leaf    cut leaf    potato leaf
$n_i$        926         288            293         104
$np_{i0}$    906.2       302.1          302.1       100.7


The contribution to $\chi^2$ from the first cell is

$$\frac{(n_1 - np_{10})^2}{np_{10}} = \frac{(926 - 906.2)^2}{906.2} = .433$$


Cells 2, 3, and 4 contribute .658, .274, and .108, respectively, so $\chi^2$ = .433 + .658 +
.274 + .108 = 1.473. Table A.11 shows that .10 is the area to the right of 6.25 under
the chi-squared curve with 3 df. Therefore the area under this curve to the right of
1.473 considerably exceeds .10. That is, P-value > .10, so $H_0$ cannot be rejected
even at this rather large level of significance. The data is quite consistent with
Mendel’s laws. ■
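
The arithmetic of Example 14.1 can be reproduced with a short Python sketch. The variable names are illustrative. Because the code keeps the expected counts unrounded, the statistic comes out slightly below the 1.473 obtained above from the rounded values in Table 14.2; the conclusion is unaffected.

```python
from fractions import Fraction

observed = [926, 288, 293, 104]   # phenotype counts from Table 14.2
null_probs = [Fraction(9, 16), Fraction(3, 16), Fraction(3, 16), Fraction(1, 16)]

n = sum(observed)                               # total sample size
expected = [float(n * p) for p in null_probs]   # unrounded expected counts
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)

print([round(e, 1) for e in expected])
print(round(chi_sq, 3))
print(chi_sq < 6.25)   # 6.25 captures upper-tail area .10 for 3 df (from the text)
```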


Although we have developed the chi-squared test for situations in which k > 2, 
it can also be used when k = 2. The null hypothesis in this case can be stated as
$H_0$: $p_1 = p_{10}$, since the relations $p_2 = 1 - p_1$ and $p_{20} = 1 - p_{10}$ make the inclusion of
$p_2 = p_{20}$ in $H_0$ redundant. The alternative hypothesis is $H_a$: $p_1 \neq p_{10}$. These hypotheses
can also be tested using a two-tailed z test with test statistic

$$Z = \frac{(N_1/n) - p_{10}}{\sqrt{\dfrac{p_{10}(1 - p_{10})}{n}}} = \frac{\hat{p}_1 - p_{10}}{\sqrt{\dfrac{p_{10}\,p_{20}}{n}}}$$

Surprisingly, the two test procedures are completely equivalent. This is because it can
be shown that $Z^2 = \chi^2$ and $(z_{\alpha/2})^2 = \chi^2_{1,\alpha}$, so the relevant tail areas (P-values) are identical.*
If the alternative is either $H_a$: $p_1 > p_{10}$ or $H_a$: $p_1 < p_{10}$, the chi-squared test cannot
be used. One must then revert to an upper- or lower-tailed z test.
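
The identity between the squared z statistic and the chi-squared statistic for k = 2 follows from a short calculation. The steps below are a supplementary sketch written in terms of the observed counts $n_1$ and $n_2 = n - n_1$; they use only $p_{20} = 1 - p_{10}$.

```latex
\begin{align*}
n_2 - np_{20} &= (n - n_1) - n(1 - p_{10}) = -(n_1 - np_{10}) \\[4pt]
\chi^2 &= \frac{(n_1 - np_{10})^2}{np_{10}} + \frac{(n_2 - np_{20})^2}{np_{20}}
        = (n_1 - np_{10})^2\left(\frac{1}{np_{10}} + \frac{1}{np_{20}}\right) \\[4pt]
       &= \frac{(n_1 - np_{10})^2\,(p_{10} + p_{20})}{n\,p_{10}\,p_{20}}
        = \frac{(n_1 - np_{10})^2}{n\,p_{10}\,p_{20}}
        = \left(\frac{(n_1/n) - p_{10}}{\sqrt{p_{10}\,p_{20}/n}}\right)^{2} = z^2
\end{align*}
```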

As is the case with all test procedures, one must be careful not to confuse statistical
significance with practical significance. A computed $\chi^2$ that exceeds $\chi^2_{\alpha,k-1}$
may be a result of a very large sample size rather than any practical differences
between the hypothesized $p_{i0}$’s and true $p_i$’s. Thus if $p_{10} = p_{20} = p_{30} = 1/3$, but the
true $p_i$’s have values .330, .340, and .330, a large value of $\chi^2$ is sure to arise with a
sufficiently large n. Before rejecting $H_0$, the sample proportions $\hat{p}_i$ should be examined to see whether
they suggest a model different from that of $H_0$ from a practical point of view.
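
To see how quickly sample size alone can drive the statistic upward, the sketch below evaluates $\chi^2$ in the idealized case where the observed counts equal exactly n times the true proportions .330, .340, .330. That idealization is an assumption made for illustration (real counts would vary randomly around these values), and the function name is hypothetical.

```python
def chi_squared_if_sample_matches(true_probs, null_probs, n):
    """Chi-squared value when the observed counts equal n times the true p_i's."""
    return sum((n * p - n * p0) ** 2 / (n * p0)
               for p, p0 in zip(true_probs, null_probs))

true_probs = [0.330, 0.340, 0.330]
null_probs = [1 / 3, 1 / 3, 1 / 3]

for n in (1_000, 10_000, 100_000):
    print(n, round(chi_squared_if_sample_matches(true_probs, null_probs, n), 2))
```

In this idealized setting the statistic grows in direct proportion to n (about .0002n), so a departure from $H_0$ of no practical importance eventually exceeds any fixed critical value.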


$\chi^2$ When the $p_i$’s Are Functions of Other Parameters

Sometimes the $p_i$’s are hypothesized to depend on a smaller number of parameters
$\theta_1, \ldots, \theta_m$ (m < k). Then a specific hypothesis involving the $\theta_i$’s yields specific $p_{i0}$’s,
which are then used in the $\chi^2$ test.


EXAMPLE 14.2 In a well-known genetics article (“The Progeny in Generations F,, to F,, of 
a Cross Between a Yellow-Wrinkled and a Green-Round Seeded Pea,’ J. of 
Genetics, 1923: 255-331), the early statistician G. U. Yule analyzed data result- 
ing from crossing garden peas. The dominant alleles in the experiment were 
Y = yellow color and R = round shape, resulting in the double dominant YR. Yule 
examined 269 four-seed pods resulting from a dihybrid cross and counted the num- 
ber of YR seeds in each pod. Letting X denote the number of YRs in a randomly 
selected pod, possible X values are 0, 1, 2, 3, 4, which we identify with cells 1, 2, 3, 
4, and 5 of a rectangular table (so, e.g., a pod with X = 4 yields an observed count 
in cell 5). 

The hypothesis that the Mendelian laws are operative and that genotypes of indi- 
vidual seeds within a pod are independent of one another implies that X has a binomial 
distribution with n = 4 and $\theta = 9/16$. We thus wish to test $H_0$: $p_1 = p_{10}, \ldots, p_5 = p_{50}$,
where

$$p_{i0} = P(i - 1 \text{ YRs among 4 seeds when } H_0 \text{ is true}) = \binom{4}{i-1}\theta^{\,i-1}(1 - \theta)^{4-(i-1)}, \qquad i = 1, 2, 3, 4, 5;\ \theta = \frac{9}{16}$$

Yule’s data and the computations are in Table 14.3, with expected cell counts
$np_{i0} = 269p_{i0}$.


* The fact that $(z_{\alpha/2})^2 = \chi^2_{1,\alpha}$ is a consequence of the relationship between the standard normal distribution
and the chi-squared distribution with 1 df; if $Z \sim N(0, 1)$, then $Z^2$ has a chi-squared distribution with $\nu = 1$.
Table 14.3 Observed and Expected Cell Counts for Example 14.2

Cell i                               1        2        3        4        5
YR peas/pods                         0        1        2        3        4
Observed                             16       45       100      82       26
Expected                             9.86     50.68    97.75    83.78    26.93
(observed − expected)²/expected      3.823    .637     .052     .038     .032


Thus $\chi^2$ = 3.823 + ⋯ + .032 = 4.582. Appendix Table A.11 shows that because
4.582 < 7.77, the P-value for the test exceeds .10. $H_0$ should not be rejected at any
reasonable significance level. ■
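
The expected counts in Table 14.3 come directly from the binomial pmf, which the following Python sketch evaluates. The variable names are illustrative. The code carries unrounded expected counts, so its statistic (about 4.59) differs slightly from the 4.582 computed above from the rounded table entries.

```python
from math import comb

observed = [16, 45, 100, 82, 26]   # pods with 0, 1, 2, 3, 4 YR seeds
theta = 9 / 16                     # P(YR) for a single seed under H0
seeds = 4
n = sum(observed)                  # number of pods

# Binomial(4, theta) probabilities for X = 0, 1, ..., 4 (cells 1 through 5).
null_probs = [comb(seeds, x) * theta ** x * (1 - theta) ** (seeds - x)
              for x in range(seeds + 1)]
expected = [n * p for p in null_probs]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print([round(e, 2) for e in expected])
print(round(chi_sq, 2))
```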


$\chi^2$ When the Underlying Distribution Is Continuous

We have so far assumed that the k categories are naturally defined in the context of
the experiment under consideration. The $\chi^2$ test can also be used to test whether a
sample comes from a specific underlying continuous distribution. Let X denote the
variable being sampled and suppose the hypothesized pdf of X is $f_0(x)$. As in the
construction of a frequency distribution in Chapter 1, subdivide the measurement
scale of X into k intervals $[a_0, a_1), [a_1, a_2), \ldots, [a_{k-1}, a_k)$, where the interval $[a_{i-1}, a_i)$
includes the value $a_{i-1}$ but not $a_i$. The cell probabilities specified by $H_0$ are then

$$p_{i0} = P(a_{i-1} \leq X < a_i) = \int_{a_{i-1}}^{a_i} f_0(x)\,dx$$

The cells should be chosen so that $np_{i0} \geq 5$ for i = 1, …, k. Often they are selected
so that the $np_{i0}$’s are equal.


EXAMPLE 14.3 To see whether the time of onset of labor among expectant mothers is uniformly 
distributed throughout a 24-hour day, we can divide a day into k periods, each of 
length 24/k. The null hypothesis states that f(x) is the uniform pdf on the interval [0,
24], so that $p_{i0} = 1/k$. The article “The Hour of Birth” (British J. of Preventive and
Social Medicine, 1953: 43-59) reports on 1186 onset times, which were categorized
into k = 24 1-hour intervals beginning at midnight, resulting in cell counts of 52, 73,
89, 88, 68, 47, 58, 47, 48, 53, 47, 34, 21, 31, 40, 24, 37, 31, 47, 34, 36, 44, 78, and
59. Each expected cell count is 1186 · (1/24) = 49.42, and the resulting value of $\chi^2$
is 162.77. Statistical software gives P-value = .000 to three decimal places, so $H_0$ is resoundingly rejected at
any sensible significance level. Generally speaking, it appears that labor is much more
likely to commence very late at night than during normal waking hours. ■
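
The value of the statistic in Example 14.3 can be checked with a few lines of Python. The counts are those listed in the example; the variable names are illustrative.

```python
observed = [52, 73, 89, 88, 68, 47, 58, 47, 48, 53, 47, 34,
            21, 31, 40, 24, 37, 31, 47, 34, 36, 44, 78, 59]   # hourly counts from midnight

k = len(observed)      # number of one-hour cells
n = sum(observed)      # total number of onset times
expected = n / k       # same expected count in every cell under the uniform H0
chi_sq = sum((o - expected) ** 2 / expected for o in observed)

print(k, n, round(expected, 2), round(chi_sq, 2))
```

The largest contributions come from the early-morning cells with counts 89 and 88 and from the midday cell with count 21, which is consistent with the example's closing remark.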


For testing whether a sample comes from a specific normal distribution, the
fundamental parameters are $\theta_1 = \mu$ and $\theta_2 = \sigma$, and each $p_{i0}$ will be a function of
these parameters.


EXAMPLE 14.4 At a certain university, final exams are supposed to last 2 hours. The psychology 
department constructed a departmental final for an elementary course that was 
believed to satisfy the following criteria: (1) actual time taken to complete the exam is 
normally distributed, (2) $\mu$ = 100 min, and (3) exactly 90% of all students will finish
within the 2-hour period. To see whether this is actually the case, 120 students were
randomly selected, and their completion times recorded. It was decided that k = 8
intervals should be used. The criteria imply that the 90th percentile of the completion
time distribution is $\mu + 1.28\sigma = 120$. Since $\mu = 100$, this implies that $\sigma = 15.63$.
The eight intervals that divide the standard normal scale into eight equally likely
segments are [0, .32), [.32, .675), [.675, 1.15), and [1.15, ∞), and their four counterparts
are on the other side of 0. For $\mu = 100$ and $\sigma = 15.63$, these intervals become
[100, 105), [105, 110.55), [110.55, 117.97), and [117.97, ∞). Thus $p_{i0} = 1/8 = .125$
(i = 1, …, 8), so each expected cell count is $np_{i0} = 120(.125) = 15$. The observed
cell counts were 21, 17, 12, 16, 10, 15, 19, and 10, resulting in a $\chi^2$ of 7.73. With k = 8
cells the test is based on k − 1 = 7 df, and the 7 df column of Table A.11 shows that
P-value > .10, so there is no evidence for concluding
that the criteria have not been met. ■
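
Equiprobable cell boundaries such as those in Example 14.4 are simply quantiles of the hypothesized distribution. The sketch below obtains them with `statistics.NormalDist` from the Python standard library (version 3.8 or later) and recomputes the statistic; the variable names are illustrative. Because the code uses unrounded normal quantiles, the boundaries differ from those quoted in the example in the second decimal place.

```python
from statistics import NormalDist

mu, sigma = 100, 15.63   # values from Example 14.4
k = 8
dist = NormalDist(mu, sigma)

# Interior boundaries a_1, ..., a_{k-1}: the i/k quantiles of the hypothesized normal.
boundaries = [dist.inv_cdf(i / k) for i in range(1, k)]

observed = [21, 17, 12, 16, 10, 15, 19, 10]
expected = sum(observed) / k   # 120 / 8 = 15 per cell
chi_sq = sum((o - expected) ** 2 / expected for o in observed)

print([round(b, 2) for b in boundaries])
print(round(chi_sq, 2))
```

The same quantile approach applies to any hypothesized continuous distribution whose inverse cdf is available.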


EXERCISES Section 14.1 (1-11) 


What conclusion would be appropriate for an upper- 
tailed chi-squared test in each of the following situations? 


a. $\alpha = .05$, df = 4, $\chi^2 = 12.25$
b. $\alpha = .01$, df = 3, $\chi^2 = 8.54$
c. $\alpha = .10$, df = 2, $\chi^2 = 4.36$
d. $\alpha = .01$, k = 6, $\chi^2 = 10.20$


The article “Racial Stereotypes in Children’s Television 
Commercials” (J. of Adver. Res., 2008: 80-93) reported 
the following frequencies with which ethnic characters 
appeared in recorded commercials that aired on 
Philadelphia television stations. 


Ethnicity:    African American    Asian    Caucasian    Hispanic
Frequency:    57                  11       330          6


The 2000 census proportions for these four ethnic groups 
are .177, .032, .734, and .057, respectively. Does the data 
suggest that the proportions in commercials are different 
from the census proportions? Carry out a test of appro- 
priate hypotheses using a significance level of .01. 


It is hypothesized that when homing pigeons are disori- 
ented in a certain manner, they will exhibit no preference 
for any direction of flight after takeoff (so that the direc- 
tion X should be uniformly distributed on the interval 
from 0° to 360°). To test this, 120 pigeons are disori- 
ented, let loose, and the direction of flight of each is 
recorded; the resulting data follows. Use the chi-squared 
test at level .10 to see whether the data supports the 
hypothesis. 


Direction    0–<45°       45–<90°      90–<135°
Frequency    12           16           17

Direction    135–<180°    180–<225°    225–<270°
Frequency    15           13           20

Direction    270–<315°    315–<360°
Frequency    17           10


The article “Application of Methods for Central 
Statistical Monitoring in Clinical Trials” (Clinical 
Trials, 2013: 783-806) made a strong case for central sta- 
tistical monitoring as an alternative to more expensive 
onsite data verification. It suggested various methods for 
identifying data characteristics such as outliers, incorrect 
dates, anomalous data patterns, unusual correlation struc- 
tures, and digit preference. Exercise 3.21 of this book 
introduced Benford’s Law, which gives a probability 
model for the first significant digit in many large data sets: 
$p(x) = \log_{10}((x + 1)/x)$ for x = 1, 2, …, 9. The cited article
gave the following frequencies for the first significant digit 
in a variety of variables whose values were determined in 
one particular clinical trial: 


Digit    1      2      3      4
Freq.    342    180    164    155

Digit    5      6      7      8      9
Freq.    86     65     54     47     56


Carry out a test of hypotheses to see whether or not these 
frequencies are consistent with Benford’s Law (the cited 
article gave P-value information). 

An information-retrieval system has ten storage loca- 
tions. Information has been stored with the expectation 
that the long-run proportion of requests for location i is 
given by $p_i = (5.5 - |i - 5.5|)/30$. A sample of 200
retrieval requests gave the following frequencies for 
locations 1-10, respectively: 4, 15, 23, 25, 38, 31, 32, 14, 
10, and 8. Use a chi-squared test at significance level .10 
to decide whether the data is consistent with the a priori 
proportions. 


The article “The Gap Between Wine Expert Ratings 
and Consumer Preferences” (Intl. J. of Wine Business 


Res., 2008: 335-351) studied differences between 
expert and consumer ratings by considering medal rat- 
ings for wines, which could be gold (G), silver (S), or 
bronze (B). Three categories were then established: 
1. Rating is the same [(G,G), (B,B), (S,S)]; 2. Rating 
differs by one medal [(G,S), (S,G), (S,B), (B,S)]; and 3. 
Rating differs by two medals [(G,B), (B,G)]. The 
observed frequencies for these three categories were 69, 
102, and 45, respectively. On the hypothesis of equally 
likely expert ratings and consumer ratings being 
assigned completely by chance, each of the nine medal 
pairs has probability 1/9. Carry out an appropriate chi- 
squared test using a significance level of .10. 


Criminologists have long debated whether there is a rela- 
tionship between weather conditions and the incidence of 
violent crime. The author of the article “Is There a 
Season for Homicide?” (Criminology, 1988: 287-296) 
classified 1361 homicides according to season, resulting 
in the accompanying data. Test the null hypothesis of 
equal proportions using $\alpha = .01$.


Winter Spring Summer Fall 


328 334 372 327 


The article “Psychiatric and Alcoholic Admissions 
Do Not Occur Disproportionately Close to Patients’ 
Birthdays” (Psychological Reports, 1992: 944-946) 
focuses on the existence of any relationship between 
the date of patient admission for treatment of alcohol- 
ism and the patient’s birthday. Assuming a 365-day 
year (i.e., excluding leap year), in the absence of any 
relation, a patient’s admission date is equally likely to 
be any one of the 365 possible days. The investigators 
established four different admission categories: (1) 
within 7 days of birthday; (2) between 8 and 30 days, 
inclusive, from the birthday; (3) between 31 and 90 
days, inclusive, from the birthday; and (4) more than 
90 days from the birthday. A sample of 200 patients 
gave observed frequencies of 11, 24, 69, and 96 for 
categories 1, 2, 3, and 4, respectively. State and test the 
relevant hypotheses using a significance level of .01. 


The response time of a computer system to a request for
a certain type of information is hypothesized to have an
exponential distribution with parameter $\lambda = 1$ sec (so if
X = response time, the pdf of X under $H_0$ is $f_0(x) = e^{-x}$
for $x \geq 0$).

a. If you had observed $X_1, X_2, \ldots, X_n$ and wanted to use
the chi-squared test with five class intervals having
equal probability under $H_0$, what would be the resulting
class intervals?

b. Carry out the chi-squared test using the following data
resulting from a random sample of 40 response times:

.10     .99     1.14    1.26    3.24    .12     .26     .80
.79     1.16    1.76    .41     .59     .27     2.22    .66
.71     2.21    .68     .43     .11     .46     .69     .38
.91     .55     .81     2.51    2.77    .16     1.11    .02
2.13    .19     1.21    1.13    2.93    2.14    .34     .44

10. a. Show that another expression for the chi-squared
statistic is

$$\chi^2 = \sum_{i=1}^{k} \frac{N_i^2}{np_{i0}} - n$$

Why is it more efficient to compute $\chi^2$ using this formula?

b. When the null hypothesis is $H_0$: $p_1 = p_2 = \cdots = p_k = 1/k$
(i.e., $p_{i0} = 1/k$ for all i), how does the formula
of part (a) simplify? Use the simplified expression
to calculate $\chi^2$ for the pigeon/direction data in
Exercise 4.

11. a. Having obtained a random sample from a population,
you wish to use a chi-squared test to decide whether
the population distribution is standard normal. If you
base the test on six class intervals having equal probability
under $H_0$, what should be the class intervals?

b. If you wish to use a chi-squared test to test $H_0$: the
population distribution is normal with $\mu = .5$,
$\sigma = .002$ and the test is to be based on six equiprobable
(under $H_0$) class intervals, what should be these
intervals?

c. Use the chi-squared test with the intervals of part (b)
to decide, based on the following 45 bolt diameters,
whether bolt diameter is a normally distributed variable
with $\mu = .5$ in., $\sigma = .002$ in.

.4974    .4976    .4991    .5014    .5008    .4993
.4994    .5010    .4997    .4993    .5013    .5000
.5017    .4984    .4967    .5028    .4975    .5013
.4972    .5047    .5069    .4977    .4961    .4987
.4990    .4974    .5008    .5000    .4967    .4977
.4992    .5007    .4975    .4998    .5000    .5008
.5021    .4959    .5015    .5012    .5056    .4991
.5006    .4987    .4968

14.2 Goodness-of-Fit Tests for Composite Hypotheses 


In the previous section, we presented a goodness-of-fit test based on a $\chi^2$ statistic
for deciding between $H_0$: $p_1 = p_{10}, \ldots, p_k = p_{k0}$ and the alternative $H_a$ stating that
$H_0$ is not true. The null hypothesis was a simple hypothesis in the sense that each
$p_{i0}$ was a specified number, so that the expected cell counts when $H_0$ was true were
uniquely determined numbers.

In many situations, there are $k$ naturally occurring categories, but $H_0$ states
only that the $p_i$'s are functions of other parameters $\theta_1, \ldots, \theta_m$ without specifying the
values of these $\theta$'s. For example, a population may be in equilibrium with respect to
proportions of the three genotypes AA, Aa, and aa. With $p_1$, $p_2$, and $p_3$ denoting these
proportions (probabilities), one may wish to test


$$H_0\colon\ p_1 = \theta^2,\quad p_2 = 2\theta(1 - \theta),\quad p_3 = (1 - \theta)^2 \tag{14.1}$$


where $\theta$ represents the proportion of gene A in the population. This hypothesis is
composite because knowing that $H_0$ is true does not uniquely determine the cell
probabilities and expected cell counts but only their general form. To carry out a $\chi^2$ test,
the unknown $\theta_i$'s must first be estimated.

Similarly, we may be interested in testing to see whether a sample came from
a particular family of distributions without specifying any particular member of the
family. To use the $\chi^2$ test to see whether the distribution is Poisson, for example,
the parameter $\mu$ must be estimated. In addition, because there are actually an infinite
number of possible values of a Poisson variable, these values must be grouped so that
there are a finite number of cells. If $H_0$ states that the underlying distribution is normal,
use of a $\chi^2$ test must be preceded by a choice of cells and estimation of $\mu$ and $\sigma$.


$\chi^2$ When Parameters Are Estimated


As before, $k$ will denote the number of categories or cells, and $p_i$ will denote the
probability of an observation falling in the $i$th cell. The null hypothesis now states
that each $p_i$ is a function of a small number of parameters $\theta_1, \ldots, \theta_m$, with the $\theta_i$'s
otherwise unspecified:


$$\begin{aligned} H_0&\colon\ p_1 = \pi_1(\boldsymbol{\theta}), \ldots, p_k = \pi_k(\boldsymbol{\theta}) \quad \text{where } \boldsymbol{\theta} = (\theta_1, \ldots, \theta_m) \\ H_a&\colon\ \text{the hypothesis } H_0 \text{ is not true} \end{aligned} \tag{14.2}$$

For example, for $H_0$ of (14.1), $m = 1$ (there is only one $\theta$), $\pi_1(\theta) = \theta^2$,
$\pi_2(\theta) = 2\theta(1 - \theta)$, and $\pi_3(\theta) = (1 - \theta)^2$.

In the case $k = 2$, there is really only a single rv, $N_1$ (since $N_1 + N_2 = n$),
which has a binomial distribution. The joint probability that $N_1 = n_1$ and $N_2 = n_2$
is then

$$P(N_1 = n_1, N_2 = n_2) = \binom{n}{n_1} p_1^{n_1} p_2^{n_2} \propto p_1^{n_1} p_2^{n_2}$$


where $p_1 + p_2 = 1$ and $n_1 + n_2 = n$. For general $k$, the joint distribution of $N_1, \ldots, N_k$
is the multinomial distribution (Section 5.1) with


$$P(N_1 = n_1, \ldots, N_k = n_k) \propto p_1^{n_1} \cdot p_2^{n_2} \cdots p_k^{n_k} \tag{14.3}$$

When $H_0$ is true, (14.3) becomes

$$P(N_1 = n_1, \ldots, N_k = n_k) \propto [\pi_1(\boldsymbol{\theta})]^{n_1} \cdots [\pi_k(\boldsymbol{\theta})]^{n_k} \tag{14.4}$$


To apply a chi-squared test, $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)$ must be estimated.


METHOD OF ESTIMATION Let $n_1, n_2, \ldots, n_k$ denote the observed values of $N_1, \ldots, N_k$. Then
$\hat{\theta}_1, \ldots, \hat{\theta}_m$ are those values of the $\theta_i$'s that maximize (14.4).


The resulting estimators $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are the maximum likelihood estimators of
$\theta_1, \ldots, \theta_m$; this principle of estimation was discussed in Section 6.2.
EXAMPLE 14.5 In humans there is a blood group, the MN group, that is composed of individuals
having one of the three blood types M, MN, and N. Type is determined by two alleles,
and there is no dominance, so the three possible genotypes give rise to three
phenotypes. A population consisting of individuals in the MN group is in equilibrium if

$$P(\text{M}) = p_1 = \theta^2$$
$$P(\text{MN}) = p_2 = 2\theta(1 - \theta)$$
$$P(\text{N}) = p_3 = (1 - \theta)^2$$

for some $\theta$. Suppose a sample from such a population yielded the results shown in
Table 14.4.


Table 14.4 Observed Counts for Example 14.5


Type     |   M |  MN |   N |
Observed | 125 | 225 | 150 | n = 500


Then

$$[\pi_1(\theta)]^{n_1}[\pi_2(\theta)]^{n_2}[\pi_3(\theta)]^{n_3} = [\theta^2]^{n_1}[2\theta(1 - \theta)]^{n_2}[(1 - \theta)^2]^{n_3} = 2^{n_2} \cdot \theta^{2n_1 + n_2} \cdot (1 - \theta)^{n_2 + 2n_3}$$


Maximizing this with respect to $\theta$ (or, equivalently, maximizing the natural
logarithm of this quantity, which is easier to differentiate) yields


$$\hat{\theta} = \frac{2n_1 + n_2}{[(2n_1 + n_2) + (n_2 + 2n_3)]} = \frac{2n_1 + n_2}{2n}$$

With $n_1 = 125$ and $n_2 = 225$, $\hat{\theta} = 475/1000 = .475$. ■


Once $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)$ has been estimated by $\hat{\boldsymbol{\theta}} = (\hat{\theta}_1, \ldots, \hat{\theta}_m)$, the estimated
expected cell counts are the $n\pi_i(\hat{\boldsymbol{\theta}})$'s. These are now used in place of the $np_{i0}$'s of
Section 14.1 to specify a $\chi^2$ statistic.


THEOREM Under general “regularity” conditions on $\theta_1, \ldots, \theta_m$ and the $\pi_i(\cdot)$'s, if
$\theta_1, \ldots, \theta_m$ are estimated by the method of maximum likelihood as described previously
and $n$ is large,


$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \sum_{i=1}^{k} \frac{[N_i - n\pi_i(\hat{\boldsymbol{\theta}})]^2}{n\pi_i(\hat{\boldsymbol{\theta}})}$$

has approximately a chi-squared distribution with $k - 1 - m$ df when $H_0$ of
(14.2) is true. The P-value is therefore (roughly) the area under the $\chi^2_{k-1-m}$
curve to the right of the calculated $\chi^2$. In practice, the test can be used if
$n\pi_i(\hat{\boldsymbol{\theta}}) \ge 5$ for every $i$.


Notice that the number of degrees of freedom is reduced by the number of $\theta_i$'s estimated.


EXAMPLE 14.6 (Example 14.5 continued) With $\hat{\theta} = .475$ and $n = 500$, the estimated expected
cell counts are $n\pi_1(\hat{\theta}) = 500(\hat{\theta})^2 = 112.81$, $n\pi_2(\hat{\theta}) = (500)(2)(.475)(1 - .475) = 249.38$,
and $n\pi_3(\hat{\theta}) = 500 - 112.81 - 249.38 = 137.81$. Then

$$\chi^2 = \frac{(125 - 112.81)^2}{112.81} + \frac{(225 - 249.38)^2}{249.38} + \frac{(150 - 137.81)^2}{137.81} = 4.78$$

Appendix Table A.11 shows that for df $= 3 - 1 - 1 = 1$, P-value $\approx .029$. Therefore
$H_0$ is rejected at significance level .05 (but not at level .01). ■


EXAMPLE 14.7 Consider a series of games between two teams, I and II, that terminates as soon as
one team has won four games (with no possibility of a tie). A simple probability
model for such a series assumes that outcomes of successive games are independent
and that the probability of team I winning any particular game is a constant $\theta$. We
arbitrarily designate I the better team, so that $\theta \ge .5$. Any particular series can then
terminate after 4, 5, 6, or 7 games. Let $\pi_1(\theta)$, $\pi_2(\theta)$, $\pi_3(\theta)$, $\pi_4(\theta)$ denote the
probability of termination in 4, 5, 6, and 7 games, respectively. Then

$$\begin{aligned}
\pi_1(\theta) &= P(\text{I wins in 4 games}) + P(\text{II wins in 4 games}) \\
&= \theta^4 + (1 - \theta)^4 \\
\pi_2(\theta) &= P(\text{I wins 3 of the first 4 and the fifth}) + P(\text{I loses 3 of the first 4 and the fifth}) \\
&= \binom{4}{3}\theta^3(1 - \theta) \cdot \theta + \binom{4}{1}\theta(1 - \theta)^3 \cdot (1 - \theta) \\
&= 4\theta(1 - \theta)[\theta^3 + (1 - \theta)^3] \\
\pi_3(\theta) &= 10\theta^2(1 - \theta)^2[\theta^2 + (1 - \theta)^2] \\
\pi_4(\theta) &= 20\theta^3(1 - \theta)^3
\end{aligned}$$


The article “Seven-Game Series in Sports” by Groeneveld and Meeden
(Mathematics Magazine, 1975: 187-192) tested the fit of this model to results of
National Hockey League playoffs during the period 1943-1967 (when league
membership was stable). The data appears in Table 14.5.


Table 14.5 Observed and Expected Counts for the Simple Model


Cell                                1        2        3        4
Number of games played              4        5        6        7
Observed frequency                 15       26       24       18    n = 83
Estimated expected frequency   16.351   24.153   23.240   19.256


The estimated expected cell counts are $83\pi_i(\hat{\theta})$, where $\hat{\theta}$ is the value of $\theta$ that maximizes


$$\{\theta^4 + (1 - \theta)^4\}^{15} \cdot \{4\theta(1 - \theta)[\theta^3 + (1 - \theta)^3]\}^{26} \cdot \{10\theta^2(1 - \theta)^2[\theta^2 + (1 - \theta)^2]\}^{24} \cdot \{20\theta^3(1 - \theta)^3\}^{18} \tag{14.5}$$


Standard calculus methods fail to yield a nice formula for the maximizing value $\hat{\theta}$,
so it must be computed using numerical methods. The result is $\hat{\theta} = .654$, from which
$\pi_i(\hat{\theta})$ and the estimated expected cell counts are computed. The computed value of
$\chi^2$ is .360. According to the $k - 1 - m = 4 - 1 - 1 = 2$ df column of Table A.11,
P-value > .10. There is thus no reason to reject the simple model as applied to the
NHL playoff series.
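The example does not say which numerical method was used to maximize (14.5). One possibility is a golden-section search on the log of (14.5) over $[.5, .99]$, sketched below; the search routine, its bracket, and all names are assumptions made for illustration, and the search presumes the log-likelihood has a single peak on that interval.

```python
import math

observed = [15, 26, 24, 18]  # series lasting 4, 5, 6, 7 games (Table 14.5)
n = sum(observed)

def cell_probs(theta):
    """pi_1(theta), ..., pi_4(theta) for the simple model of Example 14.7."""
    q = 1 - theta
    return [
        theta**4 + q**4,
        4 * theta * q * (theta**3 + q**3),
        10 * theta**2 * q**2 * (theta**2 + q**2),
        20 * theta**3 * q**3,
    ]

def log_lik(theta):
    """Natural log of expression (14.5)."""
    return sum(o * math.log(p) for o, p in zip(observed, cell_probs(theta)))

def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a single-peaked f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):          # maximum lies in [a, d]
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:                    # maximum lies in [c, b]
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

theta_hat = golden_max(log_lik, 0.5, 0.99)
expected = [n * p for p in cell_probs(theta_hat)]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(theta_hat, 3), [round(e, 2) for e in expected], round(chi2, 3))
```

Working with the logarithm avoids multiplying many small probabilities, and restricting the search to $\theta \ge .5$ matches the convention that team I is the better team.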

The cited article also considered World Series data for the period 1903-1973.
For the simple model, $\chi^2 = 5.97$; Table A.11 yields P-value $\approx .05$. At significance
level .10, the model is of doubtful validity. The suggested reason for this is that
for the simple model


$$P(\text{series lasts six games} \mid \text{series lasts at least six games}) \ge .5 \tag{14.6}$$

whereas of the 38 series that actually lasted at least six games, only 13 lasted exactly
six. The following alternative model is then introduced:

$$\begin{aligned}
\pi_1(\theta_1, \theta_2) &= \theta_1^4 + (1 - \theta_1)^4 \\
\pi_2(\theta_1, \theta_2) &= 4\theta_1(1 - \theta_1)[\theta_1^3 + (1 - \theta_1)^3] \\
\pi_3(\theta_1, \theta_2) &= 10\theta_1^2(1 - \theta_1)^2\theta_2 \\
\pi_4(\theta_1, \theta_2) &= 10\theta_1^2(1 - \theta_1)^2(1 - \theta_2)
\end{aligned}$$

The first two $\pi_i$'s are identical to the simple model, whereas $\theta_2$ is the conditional
probability of (14.6) (which can now be any number between 0 and 1). The values
of $\theta_1$ and $\theta_2$ that maximize the expression analogous to expression (14.5) are
determined numerically as $\hat{\theta}_1 = .614$, $\hat{\theta}_2 = .342$. A summary appears in Table 14.6, and
$\chi^2 = .384$. Since two parameters are estimated, df $= k - 1 - m = 1$. The P-value
considerably exceeds .10, indicating a good fit of the data to this new model.


Table 14.6 Observed and Expected Counts for the More Complex Model


Number of games played              4       5       6       7
Observed frequency                 12      16      13      25
Estimated expected frequency    10.85   18.08   12.68   24.39


One of the conditions on the $\theta_i$'s in the theorem is that they be functionally
independent of one another. That is, no single $\theta_i$ can be determined from the values of
other $\theta_i$'s, so that $m$ is the number of functionally independent parameters estimated.
A general rule of thumb for degrees of freedom in a chi-squared test is the following.


$$\chi^2\ \text{df} = \begin{pmatrix} \text{number of freely} \\ \text{determined cell counts} \end{pmatrix} - \begin{pmatrix} \text{number of independent} \\ \text{parameters estimated} \end{pmatrix}$$


This rule will be used in connection with several different chi-squared tests in the
next section.


Goodness of Fit for Discrete Distributions 


Many experiments involve observing a random sample $X_1, X_2, \ldots, X_n$ from some
discrete distribution. One may then wish to investigate whether the underlying
distribution is a member of a particular family, such as the Poisson or negative binomial
family. In the case of both a Poisson and a negative binomial distribution, the set
of possible values is infinite, so the values must be grouped into $k$ subsets before
a chi-squared test can be used. The groupings should be done so that the expected
frequency in each cell (group) is at least 5. The last cell will then correspond to
$X$ values of $c, c + 1, c + 2, \ldots$ for some value $c$.

This grouping can considerably complicate the computation of the $\hat{\theta}_i$'s and
estimated expected cell counts. This is because the theorem requires that the $\hat{\theta}_i$'s be
obtained from the cell counts $N_1, \ldots, N_k$ rather than the sample values $X_1, \ldots, X_n$.

EXAMPLE 14.8 Table 14.7 presents count data on the number of Larrea divaricata plants found in each
of 48 sampling quadrats, as reported in the article “Some Sampling Characteristics
of Plants and Arthropods of the Arizona Desert” (Ecology, 1962: 567-571).

Table 14.7 Observed Counts for Example 14.8


Cell                 1    2    3    4    5
Number of plants     0    1    2    3   ≥4
Frequency            9    9   10   14    6


The article's author fit a Poisson distribution to the data. Let $\mu$ denote the
Poisson parameter and suppose for the moment that the six counts in cell 5 were
actually 4, 4, 5, 5, 6, 6. Then denoting sample values by $x_1, \ldots, x_{48}$, nine of the $x_i$'s
were 0, nine were 1, and so on. The likelihood of the observed sample is

$$\frac{e^{-\mu}\mu^{x_1}}{x_1!} \cdots \frac{e^{-\mu}\mu^{x_{48}}}{x_{48}!} = \frac{e^{-48\mu}\mu^{\sum x_i}}{x_1! \cdots x_{48}!} = \frac{e^{-48\mu}\mu^{101}}{x_1! \cdots x_{48}!}$$

The value of $\mu$ for which this is maximized is $\hat{\mu} = \sum x_i/n = 101/48 = 2.10$ (the
value reported in the article).

However, the $\hat{\mu}$ required for $\chi^2$ is obtained by maximizing Expression (14.4)
rather than the likelihood of the full sample. The cell probabilities are


$$\pi_i(\mu) = \frac{e^{-\mu}\mu^{i-1}}{(i - 1)!} \quad i = 1, 2, 3, 4$$

$$\pi_5(\mu) = 1 - \sum_{i=0}^{3} \frac{e^{-\mu}\mu^{i}}{i!}$$


so the right-hand side of (14.4) becomes


$$\left[\frac{e^{-\mu}\mu^{0}}{0!}\right]^{9} \left[\frac{e^{-\mu}\mu^{1}}{1!}\right]^{9} \left[\frac{e^{-\mu}\mu^{2}}{2!}\right]^{10} \left[\frac{e^{-\mu}\mu^{3}}{3!}\right]^{14} \left[1 - \sum_{i=0}^{3} \frac{e^{-\mu}\mu^{i}}{i!}\right]^{6}$$

There is no nice formula for $\hat{\mu}$, the maximizing value of $\mu$, in this latter expression,
so it must be obtained numerically. ■
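To make the distinction between the two estimates concrete, the grouped-data likelihood of Example 14.8 can be maximized numerically. The sketch below is one way to do it, assuming a golden-section search over $[.5, 5]$ and a single-peaked log-likelihood there; the routine and names are illustrative and do not come from the cited article.

```python
import math

counts = [9, 9, 10, 14, 6]  # cells: 0, 1, 2, 3, and >= 4 plants (Table 14.7)

def cell_probs(mu):
    """Poisson probabilities of 0, 1, 2, 3 and of the pooled cell {4, 5, ...}."""
    first_four = [math.exp(-mu) * mu**x / math.factorial(x) for x in range(4)]
    return first_four + [1 - sum(first_four)]

def grouped_log_lik(mu):
    """Log of the right-hand side of (14.4) for the grouped Poisson model."""
    return sum(c * math.log(p) for c, p in zip(counts, cell_probs(mu)))

def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a single-peaked f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

mu_grouped = golden_max(grouped_log_lik, 0.5, 5.0)  # estimate from cell counts
mu_full = 101 / 48                                  # full-sample mle from the example

print(round(mu_grouped, 3), round(mu_full, 3))
```

The two estimates need not coincide, because the grouped likelihood records only that six quadrats had four or more plants, not their exact counts.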


Because the parameter estimates are usually more difficult to compute from 
the grouped data than from the full sample, they are typically computed using this 
latter method. If these “full” estimators are used in the chi-squared statistic, the
distribution of the statistic when $H_0$ is true is quite complicated, so the actual P-value
cannot be determined. However, the following result usually enables us to reach a
conclusion at the desired significance level $\alpha$.


THEOREM Let $\hat{\theta}_1, \ldots, \hat{\theta}_m$ be the maximum likelihood estimators of $\theta_1, \ldots, \theta_m$ based on the
full sample $X_1, \ldots, X_n$, and let $\chi^2$ denote the statistic based on these estimators.
Also let


$P_{k-1}$ = the P-value for an upper-tailed chi-squared test based on $k - 1$ df
$P_{k-1-m}$ = the P-value for an upper-tailed chi-squared test based on $k - 1 - m$ df


Then it can be shown that


$$P_{k-1-m} \le \text{P-value} \le P_{k-1} \tag{14.7}$$

That is, the P-value for the test under consideration is sandwiched in between the
P-values for two “pure” upper-tailed chi-squared tests based on different df's. The
test procedure implied by (14.7) has the unusual feature that under some
circumstances judgment must be withheld until more data is available.


Select a significance level $\alpha$. Then

If $\alpha \le P_{k-1-m}$, do not reject $H_0$
If $\alpha \ge P_{k-1}$, reject $H_0$                                   (14.8)
If $P_{k-1-m} < \alpha < P_{k-1}$, withhold judgment


Suppose, for example, that $k = 6$, $m = 2$, and $\alpha = .05$. The two relevant df's are
$6 - 1 = 5$ and $6 - 1 - 2 = 3$. Then if $\chi^2 = 7.0$, Table A.11 shows that the P-value for a
3 df test is about .07 and the P-value for a 5 df test exceeds .10. Therefore we would
not be able to reject $H_0$ because .05 is at most the smaller of the two pure chi-squared
P-values. If, however, $\chi^2 = 15$, then the 3 df P-value is roughly .002 and the 5 df
P-value is approximately .01. Because .05 is at least the larger of these pure P-values,
we are given license to reject $H_0$. Only if .05 lies between the two pure chi-squared
P-values would we not be able to reach a conclusion.
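The three-way rule in (14.8) is simple to mechanize once the two bounding P-values are available. The sketch below is an illustration rather than a prescribed implementation: the function names are invented here, and the upper-tail areas come from the standard finite-sum formulas for a chi-squared variable with integer df instead of Table A.11.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(chi-squared with df degrees of freedom > x), integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over j < df/2 of (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: 2*P(Z > sqrt(x)) plus 2*phi(sqrt(x)) * sum of x^(r-1/2) / (1*3*...*(2r-1)).
    root = math.sqrt(x)
    normal_tails = math.erfc(root / math.sqrt(2))
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return normal_tails + 2 * phi * total

def bracket_decision(chi2, k, m, alpha):
    """Apply rule (14.8) given k cells and m parameters estimated from the full sample."""
    p_lower = chi2_sf(chi2, k - 1 - m)   # smaller of the two pure P-values
    p_upper = chi2_sf(chi2, k - 1)       # larger of the two pure P-values
    if alpha <= p_lower:
        return "do not reject H0"
    if alpha >= p_upper:
        return "reject H0"
    return "withhold judgment"

print(bracket_decision(7.0, k=6, m=2, alpha=0.05))
print(bracket_decision(15.0, k=6, m=2, alpha=0.05))
```

With $k = 6$, $m = 2$, and $\alpha = .05$ the two calls correspond to the $\chi^2 = 7.0$ and $\chi^2 = 15$ cases discussed in the text.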


EXAMPLE 14.9 (Example 14.8 continued) Using $\hat{\mu} = 2.10$, the estimated expected cell counts are
computed from $n\pi_i(\hat{\mu})$, where $n = 48$. For example,

$$n\pi_1(\hat{\mu}) = 48 \cdot \frac{e^{-2.1}(2.1)^0}{0!} = (48)(e^{-2.1}) = 5.88$$

Similarly, $n\pi_2(\hat{\mu}) = 12.34$, $n\pi_3(\hat{\mu}) = 12.96$, $n\pi_4(\hat{\mu}) = 9.07$, and
$n\pi_5(\hat{\mu}) = 48 - 5.88 - \cdots - 9.07 = 7.75$. Then

$$\chi^2 = \frac{(9 - 5.88)^2}{5.88} + \cdots + \frac{(6 - 7.75)^2}{7.75} = 6.31$$


The relevant dfs are $5 - 1 = 4$ and $5 - 2 = 3$. Then Table A.11 shows that the P-value
for a 3 df test is about .0955 and that for a 4 df test exceeds .10. Therefore at significance
level .05, $H_0$ cannot be rejected because the P-value exceeds .0955 and therefore certainly
exceeds .05. At this level, it is plausible that the actual distribution is Poisson. However,
if the selected significance level were instead .10, we'd be in the inconclusive situation
because the P-value could be (slightly) smaller than .10 or larger than .10. ■
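The expected counts and the statistic in Example 14.9 follow directly from the Poisson pmf. A short sketch, with illustrative variable names and the rounded estimate 2.10 taken from the example:

```python
import math

observed = [9, 9, 10, 14, 6]   # cells: 0, 1, 2, 3, and >= 4 plants
n = sum(observed)              # 48 quadrats
mu_hat = 2.10                  # full-sample estimate used in the example

# Poisson probabilities for 0..3; the last cell takes the remaining probability.
probs = [math.exp(-mu_hat) * mu_hat**x / math.factorial(x) for x in range(4)]
probs.append(1 - sum(probs))

expected = [n * p for p in probs]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print([round(e, 2) for e in expected], round(chi2, 2))
```

Obtaining the last expected count by subtraction guarantees that the expected counts sum to $n$, as in the example.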


Sometimes even the maximum likelihood estimates based on the full sample 
are quite difficult to compute. This is the case, for example, for the two-parameter 
(generalized) negative binomial distribution. In such situations, method-of-moments 
estimates are often used, though it is not known to what extent the use of moments 
estimators affects the null distribution of $\chi^2$.


Goodness of Fit for Continuous Distributions 


The chi-squared test can also be used to test whether the sample comes from a speci- 
fied family of continuous distributions, such as the exponential family or the normal 
family. The choice of cells (class intervals) is even more arbitrary in the continu- 
ous case than in the discrete case. To ensure that the chi-squared test is valid, the 
cells should be chosen independently of the sample observations. Once the cells are 
chosen, it is almost always quite difficult to estimate unspecified parameters (such
as $\mu$ and $\sigma$ in the normal case) from the observed cell counts, so instead mle's based
on the full sample are computed. The test procedure is again specified by (14.7)
and (14.8).


EXAMPLE 14.10 The article ‘Class Start Times, Sleep, and Academic Performance in College: A 
Path Analysis” (Chronobiology International, 2012: 318-335) reported on a study in 
which students were surveyed about various aspects of sleep behavior during a particular 
two-week period. Here is data on average sleep time per day (h) for 100 of the students: 


5.75 5.93 5.96 6.00 6.19 6.36 6.36 6.43 6.50 6.51
6.51 6.57 6.69 6.78 6.93 7.04 7.05 7.05 7.11 7.18
7.21 7.25 7.26 7.30 7.32 7.39 7.40 7.43 7.43 7.50
7.50 7.52 7.53 7.54 7.57 7.60 7.61 7.63 7.64 7.64
7.64 7.67 7.71 TAS VAS TAQ TBI 7.83 7.83 7.84 
7.86 7.86 7.87 7.88 7.89 7.93 7.96 7.98 7.99 8.00 
8.00 8.04 8.07 8.11 8.17 8.18 8.18 8.20 8.21 8.21 
8.29 8.29 8.43 8.49 8.49 8.52 8.54 8.59 8.61 8.68
8.71 8.71 8.75 8.79 8.81 8.82 8.88 8.89 8.93 9.00 
9.05 9.15 9.19 9.25 9.32 9.80 9.85 9.87 9.96 10.62


Is it reasonable to assume that the population distribution of average sleep time is 
at least approximately normal? The histogram in Figure 14.3 is not persuasive. So 
let’s carry out a chi-squared test of the null hypothesis that the distribution is normal. 


Figure 14.3 Histogram of the sleep time data from Example 14.10 (horizontal axis: average sleep time; vertical axis: frequency)


Suppose that prior to sampling, it was believed that plausible values of $\mu$ and $\sigma$ were
8 and 1, respectively. The eight equiprobable class intervals for the standard normal
distribution (each with probability .125) are $(-\infty, -1.15)$, $[-1.15, -.67)$, $[-.67, -.32)$,
$[-.32, 0)$, $[0, .32)$, $[.32, .67)$, $[.67, 1.15)$, and $[1.15, \infty)$, with each endpoint
also giving the distance in standard deviations from the mean for any other normal
distribution. For $\mu = 8$ and $\sigma = 1$, these intervals transform to $(-\infty, 6.85)$, $[6.85, 7.33)$,
$[7.33, 7.68)$, $[7.68, 8.00)$, $[8.00, 8.32)$, $[8.32, 8.67)$, $[8.67, 9.15)$, and $[9.15, \infty)$.
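The endpoints quoted above are the 1/8, 2/8, ..., 7/8 quantiles of the standard normal distribution, shifted and scaled. A brief sketch using Python's standard-library `statistics.NormalDist` (the variable names are illustrative):

```python
from statistics import NormalDist

k = 8                      # number of equiprobable cells
std_normal = NormalDist()  # mean 0, standard deviation 1

# Interior cut points: the j/k quantiles of the standard normal, j = 1, ..., k - 1.
z_cuts = [std_normal.inv_cdf(j / k) for j in range(1, k)]

# Rescale to the values believed plausible before sampling: mu = 8, sigma = 1.
mu0, sigma0 = 8.0, 1.0
x_cuts = [mu0 + sigma0 * z for z in z_cuts]

print([round(z, 2) for z in z_cuts])
print([round(x, 2) for x in x_cuts])
```

Because the cut points depend only on the values chosen before sampling, the cells remain independent of the sample observations, as the chi-squared test requires.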

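To obtain the estimated cell probabilities $\pi_1(\hat{\mu}, \hat{\sigma}), \ldots, \pi_8(\hat{\mu}, \hat{\sigma})$, we first need
the mle's $\hat{\mu}$ and $\hat{\sigma}$. In Chapter 6, the mle of $\sigma$ was shown to be $[\sum(x_i - \bar{x})^2/n]^{1/2}$
(rather than $s$), so with $s = .9481$,


$$\hat{\mu} = \bar{x} = 7.876 \qquad \hat{\sigma} = \left[\frac{\sum(x_i - \bar{x})^2}{n}\right]^{1/2} = \left[\frac{(n - 1)s^2}{n}\right]^{1/2} = .9433$$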
Each $\pi_i(\hat{\mu}, \hat{\sigma})$ is then the probability that a normal rv $X$ with mean 7.876 and standard
deviation .9433 falls in the $i$th class interval. For example,

$$\pi_2(\hat{\mu}, \hat{\sigma}) = P(6.85 \le X < 7.33) = P(-1.09 \le Z < -.58) = .1431$$

so $n\pi_2(\hat{\mu}, \hat{\sigma}) = 100(.1431) = 14.31$. Observed and estimated expected cell counts
are shown in Table 14.8.


Table 14.8 Observed and Expected Counts for Example 14.10


Cell                 (-∞, 6.85)     [6.85, 7.33)   [7.33, 7.68)   [7.68, 8.00)
Observed                 14             11             17              9
Estimated expected     13.79          14.31          13.58          13.49

Cell                 [8.00, 8.32)   [8.32, 8.67)   [8.67, 9.15)   [9.15, ∞)
Observed                 11              8             12              8
Estimated expected     12.91          11.87          11.20           8.85

The computed value of $\chi^2$ is 5.56. With $k = 8$ cells and $m = 2$ parameters estimated,
the 7 df and 5 df columns of Table A.11 show that both pure chi-squared P-values
exceed .10. Therefore our P-value certainly exceeds any reasonable $\alpha$, indicating that
the null hypothesis of population normality cannot be rejected. The evidence from the
entire sample of $n = 253$ students is somewhat less supportive of $H_0$. The P-value from
the special test for normality described in the next subsection is .086. ■


EXAMPLE 14.11 The article “Some Studies on Tuft Weight Distribution in the Opening Room”
(Textile Research J., 1976: 567-573) reports the accompanying data on the
distribution of output tuft weight $X$ (mg) of cotton fibers for the input weight $x_0 = 70$.


Interval            |  0-8 | 8-16 | 16-24 | 24-32 | 32-40 | 40-48 | 48-56 | 56-64 | 64-70
Observed frequency  |   20 |    8 |     7 |     1 |     2 |     1 |     0 |     1 |     0
Expected frequency  | 18.0 |  9.9 |   5.5 |   3.0 |   1.8 |    .9 |    .5 |    .3 |    .1


The authors postulated a truncated exponential distribution:


$$H_0\colon\ f(x) = \frac{\lambda e^{-\lambda x}}{1 - e^{-\lambda x_0}} \qquad 0 \le x \le x_0$$


The mean of this distribution is

$$\mu = \int_0^{x_0} x f(x)\,dx = \frac{1}{\lambda} - \frac{x_0 e^{-\lambda x_0}}{1 - e^{-\lambda x_0}}$$

The parameter $\lambda$ was estimated by replacing $\mu$ by $\bar{x} = 13.086$ and solving the
resulting equation to obtain $\hat{\lambda} = .0742$ (so $\hat{\lambda}$ is a method-of-moments estimate and not
an mle). Then with $\hat{\lambda}$ replacing $\lambda$ in $f(x)$, the estimated expected cell frequencies as
displayed previously are computed as


$$40\pi_i(\hat{\lambda}) = 40P(a_{i-1} \le X < a_i) = 40\int_{a_{i-1}}^{a_i} f(x)\,dx = \frac{40\left(e^{-\hat{\lambda}a_{i-1}} - e^{-\hat{\lambda}a_i}\right)}{1 - e^{-\hat{\lambda}x_0}}$$


where $[a_{i-1}, a_i)$ is the $i$th class interval. To obtain expected cell counts of at least 5,
the last six cells are combined to yield observed counts of 20, 8, 7, 5 and expected
counts of 18.0, 9.9, 5.5, 6.6. The computed value of chi-squared is then $\chi^2 = 1.34$
with a corresponding P-value that exceeds .10. Therefore $H_0$ cannot be rejected at
significance level .05, so the truncated exponential model provides a good fit. ■
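The moment equation for $\lambda$ in Example 14.11 has no closed-form solution, and the example does not say how it was solved. One option is bisection, sketched below under the assumption that the truncated-exponential mean decreases in $\lambda$ on the bracket $[.01, 1]$; the function names and bracket are illustrative choices, not taken from the cited article.

```python
import math

x0 = 70.0        # truncation point (input weight)
x_bar = 13.086   # sample mean reported in the example
n = 40
edges = [0, 8, 16, 24, 32, 40, 48, 56, 64, 70]

def trunc_exp_mean(lam):
    """Mean of the exponential distribution truncated to [0, x0]."""
    return 1 / lam - x0 * math.exp(-lam * x0) / (1 - math.exp(-lam * x0))

def solve_lambda(target, lo=0.01, hi=1.0, tol=1e-10):
    """Bisection for trunc_exp_mean(lam) = target; the mean decreases as lam grows."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if trunc_exp_mean(mid) > target:
            lo = mid   # mean still too large, so lambda must increase
        else:
            hi = mid
    return (lo + hi) / 2

lam_hat = solve_lambda(x_bar)

# Estimated expected counts for each class interval [a_{i-1}, a_i).
norm = 1 - math.exp(-lam_hat * x0)
expected = [
    n * (math.exp(-lam_hat * a) - math.exp(-lam_hat * b)) / norm
    for a, b in zip(edges, edges[1:])
]

print(round(lam_hat, 4), [round(e, 1) for e in expected])
```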


A Special Test for Normality 


Probability plots were introduced in Section 4.6 as an informal method for assessing
the plausibility of any specified population distribution as the one from which the
given sample was selected. The straighter the probability plot, the more plausible
is the distribution on which the plot is based. A normal probability plot is used for
checking whether any member of the normal distribution family is plausible. Let's
denote the sample $x_i$'s when ordered from smallest to largest by $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$.
Then the plot suggested for checking normality was a plot of the points $(x_{(i)}, y_i)$,
where $y_i = \Phi^{-1}((i - .5)/n)$.

A quantitative measure of the extent to which points cluster about a straight line is
the sample correlation coefficient $r$ introduced in Chapter 12. Consider calculating $r$ for
the $n$ pairs $(x_{(1)}, y_1), \ldots, (x_{(n)}, y_n)$. The $y_i$'s here are not observed values in a random
sample from a $y$ population, so properties of this $r$ are quite different from those described in
Section 12.5. However, it is true that the more $r$ deviates from 1, the less the probability
plot resembles a straight line (remember that a probability plot must slope upward).
This implies that the test is lower-tailed: The P-value is the area under the density curve
of $R$ (the random variable whose computed value is $r$) when $H_0$ is true to the left of $r$.
Unfortunately the distribution of $R$ is very complicated. The developers of the Minitab
software have provided critical values that capture lower-tail areas of .10, .05, and .01
for various sample sizes, which are included in our Table A.12. These critical values are
based on a slightly different definition of the $y_i$'s than that given previously.

Minitab will also construct a normal probability plot based on these $y_i$'s. The
plot will be almost identical in appearance to that based on the previous $y_i$'s. When
there are several tied $x_{(i)}$'s, Minitab computes $r$ by using the average of the
corresponding $y_i$'s as the second number in each pair.


Let $y_i = \Phi^{-1}[(i - .375)/(n + .25)]$, and compute the sample correlation
coefficient $r$ for the $n$ pairs $(x_{(1)}, y_1), \ldots, (x_{(n)}, y_n)$. The Ryan-Joiner test of


$H_0$: the population distribution is normal
versus
$H_a$: the population distribution is not normal


uses test statistic $R$ (obtained by replacing the $x_{(i)}$'s in $r$ by $X_{(i)}$'s). If $r$ coincides
with a critical value in Table A.12, we have an exact P-value (.10, .05, or .01).
Otherwise we are able to say that either P-value > .10, .05 < P-value < .10,
.01 < P-value < .05, or P-value < .01.


EXAMPLE 14.12 The following sample of $n = 20$ observations on dielectric breakdown voltage of a
piece of epoxy resin first appeared in Exercise 4.89.


$y_i$     | -1.871 -1.404 -1.127  -.917  -.742  -.587  -.446  -.313  -.186  -.062
$x_{(i)}$ |  24.46  25.61  26.25  26.42  26.66  27.15  27.31  27.54  27.74  27.94

$y_i$     |   .062   .186   .313   .446   .587   .742   .917  1.127  1.404  1.871
$x_{(i)}$ |  27.98  28.04  28.28  28.49  28.50  28.87  29.11  29.13  29.50  30.88


We asked Minitab to carry out the Ryan-Joiner test, and the result appears in
Figure 14.4. The test statistic value is $r = .9881$, and Appendix Table A.12 gives
.9600 as the critical value that captures lower-tail area .10 under the $r$ sampling
distribution curve when $n = 20$ and the underlying distribution is actually normal.
Because .9881 > .9600, the P-value exceeds .10. Therefore the null hypothesis
of normality cannot be rejected even for a significance level as large as .10.
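The test statistic of Example 14.12 can be reproduced without Minitab. The sketch below computes the $y_i$'s from the Ryan-Joiner definition and then the ordinary sample correlation coefficient; it uses the standard-library `statistics.NormalDist`, the helper name `correlation` is illustrative, and the critical value .9600 is simply copied from the example since Table A.12 itself is not reproduced here. The data contain no ties, so no averaging of $y_i$'s is needed.

```python
import math
from statistics import NormalDist

# Ordered breakdown voltages x_(1), ..., x_(20) from Example 14.12.
x_sorted = [24.46, 25.61, 26.25, 26.42, 26.66, 27.15, 27.31, 27.54, 27.74, 27.94,
            27.98, 28.04, 28.28, 28.49, 28.50, 28.87, 29.11, 29.13, 29.50, 30.88]
n = len(x_sorted)

# Ryan-Joiner normal scores: y_i = Phi^{-1}[(i - .375) / (n + .25)].
std_normal = NormalDist()
y = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def correlation(u, v):
    """Sample correlation coefficient of two equal-length sequences."""
    u_bar, v_bar = sum(u) / len(u), sum(v) / len(v)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

r = correlation(x_sorted, y)
critical_10 = 0.9600   # lower-tail .10 critical value for n = 20, as quoted in the example

print(round(r, 4), "P-value > .10" if r > critical_10 else "P-value <= .10")
```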


Figure 14.4 Minitab output from the Ryan-Joiner test for the data of Example 14.12
(normal probability plot of dielvolt; Average: 27.793, Std Dev: 1.46186, N of data: 20;
W-test for normality R: 0.9881, P-value (approx): > 0.1000)


EXERCISES Section 14.2 (12-23) 


12. Consider a large population of families in which each
family has exactly three children. If the genders of the
three children in any family are independent of one
another, the number of male children in a randomly
selected family will have a binomial distribution based
on three trials.
a. Suppose a random sample of 160 families yields the
following results. Test the relevant hypotheses by
proceeding as in Example 14.5.


Number of Male Children |  0   1   2   3
Frequency               | 14  66  64  16


b. Suppose a random sample of families in a nonhuman 
population resulted in observed frequencies of 15, 
20, 12, and 3, respectively. Would the chi-squared 
test be based on the same number of degrees of free- 
dom as the test in part (a)? Explain. 


13. A study of sterility in the fruit fly (“Hybrid Dysgenesis in
Drosophila melanogaster: The Biology of Female and
Male Sterility,” Genetics, 1979: 161-174) reports the
following data on the number of ovaries developed by each
female fly in a sample of size 1388. One model for
unilateral sterility states that each ovary develops with some
probability $p$ independently of the other ovary. Test the fit
of this model using $\chi^2$.


x = Number of Ovaries Developed |    0    1    2
Observed Count                  | 1212  118   58


14. The article “Feeding Ecology of the Red-Eyed Vireo and
Associated Foliage-Gleaning Birds” (Ecological
Monographs, 1971: 129-152) presents the accompanying
data on the variable $X$ = the number of hops before the
first flight and preceded by a flight. The author then
proposed and fit a geometric probability distribution
[$p(x) = P(X = x) = p^{x-1} \cdot q$ for $x = 1, 2, \ldots$, where $q = 1 - p$]
to the data. The total sample size was $n = 130$.


x                          |  1   2   3   4   5   6   7   8   9  10  11  12
Number of Times x Observed | 48  31  20   9   6   5   4   2   1   1   2   1

a. The likelihood is $(p^{x_1 - 1} \cdot q) \cdots (p^{x_n - 1} \cdot q) = p^{\sum x_i - n} \cdot q^n$.
Show that the mle of $p$ is given by $\hat{p} = (\sum x_i - n)/\sum x_i$, and compute $\hat{p}$
for the given data.

b. Estimate the expected cell counts using $\hat{p}$ of part (a)
[expected cell counts $= n \cdot (\hat{p})^{x-1} \cdot \hat{q}$ for $x = 1, 2, \ldots$],
and test the fit of the model using a $\chi^2$ test by combining
the counts for $x = 7, 8, \ldots,$ and 12 into one cell
($x \ge 7$).


15. A certain type of flashlight is sold with the four batteries 
included. A random sample of 150 flashlights is obtained, 
and the number of defective batteries in each is deter- 
mined, resulting in the following data: 


Number Defective |  0   1   2   3   4
Frequency        | 26  51  47  16  10


Let $X$ be the number of defective batteries in a
randomly selected flashlight. Test the null hypothesis
that the distribution of $X$ is Bin(4, $\theta$). That is, with
$p_i = P(i \text{ defectives})$, test

$$H_0\colon\ p_i = \binom{4}{i}\theta^i(1 - \theta)^{4-i} \qquad i = 0, 1, 2, 3, 4$$

[Hint: To obtain the mle of $\theta$, write the likelihood (the
function to be maximized) as $\theta^u(1 - \theta)^v$, where the
exponents $u$ and $v$ are linear functions of the cell counts.
Then take the natural log, differentiate with respect to $\theta$,
equate the result to 0, and solve for $\hat{\theta}$.]

16. Let X = the number of adult police contacts for a ran- 
domly selected individual who previously had at least 
one such contact prior to age 18. The following frequen- 
cies were calculated from information given in the article 
“Examining the Prevalence of Criminal Desistance” 
(Criminology, 2003: 423-448); our sample size differs 
slightly from what was reported because of rounding. 


x | 0 1 2 3 4 5 6 7
f | 1627 421 219 130 107 51 15 22 
x 8 9 10 11 12 13 14 15 
f 8 14 5 8 5 0 5} 2 
a. Is it plausible that the population distribution of 
number of contacts is Poisson? Carry out a chi- 

squared test. 
b. The cited article did not even entertain the possibility 
of a Poisson distribution. Instead several other models 


were proposed. One of these is based on the idea that 
each individual’s number of contacts has a Poisson 


Copyright 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). 
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it. 


17. 


18. 


19. 


distribution whose mean value p is itself a random vari- 
able having a gamma distribution. This reasoning leads 
to a generalized negative binomial distribution having 
two parameters, which must then be estimated from the 
data. After doing so, the article reported the following 
estimated probabilities corresponding to the foregoing 
x values: .6099, .1657, .0838, .0489, .0305, .0197, 
.0130, .0088, .0060, .0041, .0029, .0020, .0014, .0010, 
.0007, and .0005. Test the plausibility of this model at 
significance level .05 by combining all x values exceed- 
ing 11 into a single category (this was done in the cited 
article, which included a P-value). 


In a genetics experiment, investigators looked at 300 chro- 
mosomes of a particular type and counted the number of 
sister-chromatid exchanges on each (“On the Nature of 
Sister-Chromatid Exchanges in 5-Bromodeoxyuridine- 
Substituted Chromosomes,’ Genetics, 1979: 1251- 
1264). A Poisson model was hypothesized for the distribu- 
tion of the number of exchanges. Test the fit of a Poisson 
distribution to the data by first estimating p and then 
combining the counts for x = 8 and x = 9 into one cell. 


x = Number of Exchanges |  0   1   2   3   4   5   6   7   8   9
Observed Counts         |  6  24  42  59  62  44  41  14   6   2


The article “A Probabilistic Analysis of Dissolved 
Oxygen-Biochemical Oxygen Demand Relationship in 
Streams” (J. Water Resources Control Fed., 1969: 73-90) 
reports data on the rate of oxygenation in streams at 20°C 
in a certain region. The sample mean and standard deviation
were computed as $\bar{x} = .173$ and $s = .066$, respectively.
Based on the accompanying frequency distribution, can it
be concluded that oxygenation rate is a normally distributed
variable? Use the chi-squared test with $\alpha = .05$.


Rate (per day) Frequency 
Below .100 12 
.100-below .150 20 
.150-below .200 23 
.200-below .250 5 
.250 or more 13 


Each headlight on an automobile undergoing an annual
vehicle inspection can be focused either too high (H), too
low (L), or properly (N). Checking the two headlights
simultaneously (and not distinguishing between left and right)
results in the six possible outcomes HH, LL, NN, HL, HN,
and LN. If the probabilities (population proportions) for the
single headlight focus direction are $P(H) = \theta_1$, $P(L) = \theta_2$,
and $P(N) = 1 - \theta_1 - \theta_2$ and the two headlights are focused
independently of one another, the probabilities of the six
outcomes for a randomly selected car are the following:

\[ p_1 = \theta_1^2 \qquad p_2 = \theta_2^2 \qquad p_3 = (1 - \theta_1 - \theta_2)^2 \]
\[ p_4 = 2\theta_1\theta_2 \qquad p_5 = 2\theta_1(1 - \theta_1 - \theta_2) \]
\[ p_6 = 2\theta_2(1 - \theta_1 - \theta_2) \]


20. 


Use the accompanying data to test the null hypothesis
$H_0$: $p_1 = \pi_1(\theta_1, \theta_2), \ldots, p_6 = \pi_6(\theta_1, \theta_2)$
where the $\pi_i(\theta_1, \theta_2)$'s are given previously.


Outcome   | HH  LL  NN  HL  HN  LN
Frequency | 49  26  14  20  53  38


[Hint: Write the likelihood as a function of $\theta_1$ and $\theta_2$,
take the natural log, then compute $\partial/\partial\theta_1$ and $\partial/\partial\theta_2$,
equate them to 0, and solve for $\hat{\theta}_1$, $\hat{\theta}_2$.]


The article “Compatibility of Outer and Fusible
Interlining Fabrics in Tailored Garments” (Textile
Res. J., 1997: 137-142) gave the following observations
on bending rigidity ($\mu$N · m) for medium-quality fabric
specimens, from which the accompanying Minitab output
was obtained:


24.6   12.7   14.4   30.6   16.1    9.5   31.5   17.2
46.9   68.3   30.8  116.7   39.5   73.8   80.6   20.3
25.8   30.9   39.2   36.8   46.6   15.6   32.3


Normal Probability Plot

[Minitab normal probability plot: probability versus bendrig]

Average: 37.4217      W test for normality
Std Dev: 25.8101      R: 0.9116
N of data: 23         P-value (approx): < 0.0100


21. 


22. 


23. 
Would you use a one-sample t confidence interval to
estimate true average bending rigidity? Explain your
reasoning.


The article from which the data in Exercise 20 was 
obtained also gave the accompanying data on the com- 
posite mass/outer fabric mass ratio for high-quality fab- 
ric specimens. 


1.15  1.40  1.34  1.29  1.36  1.26  1.22  1.40
1.29  1.41  1.32  1.34  1.26  1.36  1.36  1.30
1.28  1.45  1.29  1.28  1.38  1.55  1.46  1.32


Minitab gave r = .9852 as the value of the Ryan-Joiner
test statistic and reported that P-value > .10. Would you
use the one-sample t test to test hypotheses about the
value of the true average ratio? Why or why not?


The article “A Method for the Estimation of Alcohol in 
Fortified Wines Using Hydrometer Baumé and 
Refractometer Brix” (Amer. J. of Enol. and Vitic., 
2006: 486-490) gave duplicate measurements on dis- 
tilled alcohol content (%) for a sample of 35 port wines. 
Here are averages of those duplicate measurements: 


15.30 16.20 16.35 17.15 17.48 17.73 17.75
17.85 18.00 18.68 18.82 18.85 19.03 19.07 
19.08 19.17 19.20 19.20 19.33 19.37 19.45 
19.48 19.50 19.58 19.60 19.62 19.90 19.97 
20.00 20.05 21.22 22.25 22.75 23.25 23.78 


Use the Ryan-Joiner test to decide at significance level 
.05 whether a normal distribution provides a plausible 
model for alcohol content. 


The article “Nonbloated Burned Clay Aggregate 
Concrete” (J. of Materials, 1972: 555-563) reports the 
following data on 7-day flexural strength of nonbloated 
burned clay aggregate concrete samples (psi): 


257 327 317 300 340 340 343 374 377 386 
383 393 407 407 434 427 440 407 450 440 
456 460 456 476 480 490 497 526 546 700 


Test at level .10 to decide whether flexural strength is a 
normally distributed variable. 


14.3. Two-Way Contingency Tables 


In the scenarios of Sections 14.1 and 14.2, the observed frequencies were displayed
in a single row within a rectangular table. We now study problems in which the data
also consists of counts or frequencies, but the data table will now have I rows ($I \ge 2$)
and J columns, so IJ cells. There are two commonly encountered situations in which
such data arises:


1. There are I populations of interest, each corresponding to a different row of
the table, and each population is divided into the same J categories. A sample
is taken from the ith population (i = 1, ..., I), and the counts are entered
in the cells in the ith row of the table. For example, customers of each of
I = 3 department-store chains might have available the same J = 5 payment
categories: cash, check, and credit cards from American Express, Visa, and
MasterCard.


2. There is a single population of interest, with each individual in the population
categorized with respect to two different factors. There are I categories associated
with the first factor and J categories associated with the second factor. A single
sample is taken, and the number of individuals belonging in both category i of
factor 1 and category j of factor 2 is entered in the cell in row i, column
j (i = 1, ..., I; j = 1, ..., J). As an example, customers making a purchase might be
classified according to both department in which the purchase was made, with I = 6
departments, and according to method of payment, with J = 5 as in (1) above.


Let $n_{ij}$ denote the number of individuals in the sample(s) falling in the (i, j)th cell
(row i, column j) of the table; that is, $n_{ij}$ is the (i, j)th cell count. The table displaying
the $n_{ij}$'s is called a two-way contingency table; a prototype is shown in Table 14.9.


Table 14.9 A Two-Way Contingency Table

        1        2      ...      j      ...      J
 1    n_11     n_12     ...    n_1j     ...    n_1J
 2    n_21
 .
 i    n_i1              ...    n_ij     ...
 .
 I    n_I1                              ...    n_IJ


In situations of type 1, we want to investigate whether the proportions in the 
different categories are the same for all populations. The null hypothesis states that 
the populations are homogeneous with respect to these categories. In type 2 situa- 
tions, we investigate whether the categories of the two factors occur independently 
of one another in the population. 


Testing for Homogeneity 


Suppose each individual in every one of the I populations belongs in exactly one of
the same J categories. A sample of $n_i$ individuals is taken from the ith population;
let $n = \sum n_i$ and

$n_{ij}$ = the number of individuals in the ith sample who fall into category j

$n_{\cdot j} = \sum_{i=1}^{I} n_{ij}$ = the total number of individuals among the n sampled who fall into category j

The $n_{ij}$'s are recorded in a two-way contingency table with I rows and J columns.
The sum of the $n_{ij}$'s in the ith row is $n_i$, and the sum of entries in the jth column will
be denoted by $n_{\cdot j}$.

Let

$p_{ij}$ = the proportion of the individuals in population i who fall into category j

Thus, for population 1, the J proportions are $p_{11}, p_{12}, \ldots, p_{1J}$ (which sum to 1) and
similarly for the other populations. The null hypothesis of homogeneity states that
the proportion of individuals in category j is the same for each population and that
this is true for every category; that is, for every j, $p_{1j} = p_{2j} = \cdots = p_{Ij}$.

When $H_0$ is true, we can use $p_1, p_2, \ldots, p_J$ to denote the population proportions
in the J different categories; these proportions are common to all I populations.
The expected number of individuals in the ith sample who fall in the jth category
when $H_0$ is true is then $E(N_{ij}) = n_i \cdot p_j$. To estimate $E(N_{ij})$ we must first estimate $p_j$,
the proportion in category j. Among the total sample of n individuals, $N_{\cdot j}$ fall into
category j, so we use $\hat{p}_j = N_{\cdot j}/n$ as the estimator (this can be shown to be the maximum
likelihood estimator of $p_j$). Substitution of the estimate $\hat{p}_j$ for $p_j$ in $n_i p_j$ yields a
simple formula for estimated expected counts under $H_0$:

\[ \hat{e}_{ij} = \text{estimated expected count in cell } (i, j) = n_i \cdot \frac{n_{\cdot j}}{n} = \frac{(i\text{th row total})(j\text{th column total})}{n} \tag{14.9} \]
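As a rough illustration of Expression (14.9), the following Python sketch computes the estimated expected counts from the row and column totals. The function name `expected_counts` and the 2 × 2 table of counts are hypothetical and are not taken from the text.

```python
def expected_counts(table):
    """Estimated expected counts: (ith row total)(jth column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of observed counts (illustration only).
counts = [[10, 20],
          [30, 40]]
print(expected_counts(counts))
```

Each estimated expected count depends on the data only through the marginal totals, which is why the same formula will serve for both the homogeneity and independence tests.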


The test statistic here has the same form as in Sections 14.1 and 14.2. The number
of degrees of freedom comes from the general rule of thumb. In each row of
Table 14.9 there are J − 1 freely determined cell counts (each sample size $n_i$ is
fixed), so there are a total of I(J − 1) freely determined cells. Parameters $p_1, \ldots, p_J$
are estimated, but because $\sum p_j = 1$, only J − 1 of these are independent. Thus
df = I(J − 1) − (J − 1) = (J − 1)(I − 1).


Null hypothesis: $H_0$: $p_{1j} = p_{2j} = \cdots = p_{Ij}$, j = 1, 2, ..., J
Alternative hypothesis: $H_a$: $H_0$ is not true


Test statistic value:

\[ \chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} \]

When $H_0$ is true and $\hat{e}_{ij} \ge 5$ for all i, j, the test statistic has approximately a
chi-squared distribution with (I − 1)(J − 1) df. The test is again upper-tailed,
so the P-value is the area under the $\chi^2_{(I-1)(J-1)}$ curve to the right of the calculated
$\chi^2$. Table A.11 can be used to obtain P-value information as described
in Section 14.1.


EXAMPLE 14.13 A company packages a particular product in cans of three different sizes, each one
using a different production line. Most cans conform to specifications, but a quality
control engineer has identified the following reasons for nonconformance:

1. Blemish on can
2. Crack in can
3. Improper pull tab location
4. Pull tab missing
5. Other




A sample of nonconforming units is selected from each of the three lines, and each 
unit is categorized according to reason for nonconformity, resulting in the following 
contingency table data: 


                            Reason for Nonconformity
                                                               Sample
                   Blemish   Crack   Location   Missing   Other   Size
              1       31       68       17         21       13     150
Production    2       19       47       30         19       10     125
Line          3       33       26       16         14       11     100
          Total       83      141       63         54       34     375


Does the data suggest that the proportions falling in the various nonconformance 
categories are not the same for the three lines? The parameters of interest are the 
various proportions, and the relevant hypotheses are 


$H_0$: the production lines are homogeneous with respect to the five nonconformance
categories; that is, $p_{1j} = p_{2j} = p_{3j}$ for j = 1, ..., 5
$H_a$: the production lines are not homogeneous with respect to the categories


The estimated expected frequencies (assuming homogeneity) must now be calcu- 
lated. Consider the first nonconformance category for the first production line. When 
the lines are homogeneous, 

estimated expected number among the 150 selected units that are blemished

\[ = \frac{(\text{first row total})(\text{first column total})}{\text{total of sample sizes}} = \frac{(150)(83)}{375} = 33.20 \]

The contribution of the cell in the upper-left corner to $\chi^2$ is then

\[ \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \frac{(31 - 33.20)^2}{33.20} = .146 \]


The other contributions are calculated in a similar manner. Figure 14.5 shows 
Minitab output for the chi-squared test. The observed count is the top number in 
each cell, and directly below it is the estimated expected count. The contribution of 


Expected counts are printed below observed counts 
Chi-Square contributions are printed below expected counts 


        Blemish    Crack   Location   Missing    Other   Total

1            31       68         17        21       13     150
          33.20    56.40      25.20     21.60    13.60
          0.146    2.386      2.668     0.017    0.026

2            19       47         30        19       10     125
          27.67    47.00      21.00     18.00    11.33
          2.715    0.000      3.857     0.056    0.157

3            33       26         16        14       11     100
          22.13    37.60      16.80     14.40     9.07
          5.335    3.579      0.038     0.011    0.412

Total        83      141         63        54       34     375


Chi-Sq = 21.403, DF = 8, P-Value = 0.006 


Figure 14.5 Minitab output for the chi-squared test of Example 14.13 
each cell to $\chi^2$ appears below the counts, and the test statistic value is $\chi^2 = 21.403$.
All estimated expected counts are at least 5, so combining categories is unneces- 
sary. The test is based on (3 — 1)(5 — 1) = 8 df. Appendix Table A.11 shows that 
the area under the 8 df chi-squared curve to the right of 20.09 is .010 and the area 
to the right of 21.95 is .005. Therefore we can say that .005 < P-value < .01; 
Minitab gives P-value = .006. Using a significance level of .01, the null hypoth- 
esis of homogeneity can be rejected in favor of the alternative that the distribution 
of reason for nonconformity is somehow different for the three production lines. 
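The calculation in Example 14.13 can be sketched in Python as a check on the Minitab output. This is an illustrative sketch, not the method used to produce Figure 14.5: the function names are hypothetical, and the P-value routine relies on the closed-form upper-tail area that is available only when the number of degrees of freedom is even (here df = 8).

```python
import math


def chi_squared_statistic(table):
    """Return (chi-squared statistic, df) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


def upper_tail_even_df(x, df):
    """Area to the right of x under a chi-squared curve with even df."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


# Observed counts from Example 14.13 (rows: production lines 1, 2, 3).
lines = [[31, 68, 17, 21, 13],
         [19, 47, 30, 19, 10],
         [33, 26, 16, 14, 11]]

chi2, df = chi_squared_statistic(lines)
p_value = upper_tail_even_df(chi2, df)
print(round(chi2, 3), df, round(p_value, 3))
```

The statistic and degrees of freedom should agree with the values reported in Figure 14.5, and the P-value should fall between .005 and .01 as found from Table A.11.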

At this point it is desirable to seek an explanation for why the hypothesis of 
homogeneity is implausible. Figure 14.6 shows a stacked comparative bar chart of 
the data. It appears that the three lines are relatively homogeneous with respect to
the Other and Missing categories but not with respect to the Location, Crack, and 
Blemish categories. Line 1’s incidence rate of crack nonconformities is much higher 
than for the other two lines, whereas location nonconformities appear to be more of 
a problem for line 2 than for the other two lines and blemish nonconformities occur 
much more frequently for line 3 than for the other two lines. 


[Stacked bar chart: one bar for each of Line 1, Line 2, and Line 3 on a vertical scale from 0 to 100; legend (Reason): Other, Missing, Location, Crack, Blemish]
Figure 14.6 Stacked comparative bar chart for the data of Example 14.13


Testing for Independence (Lack of Association) 


We focus now on the relationship between two different factors in a single population.
Each individual in the population is assumed to belong in exactly one of the I
categories associated with the first factor and exactly one of the J categories associated
with the second factor. For example, the population of interest might consist
of all individuals who regularly watch the national news on television, with the first
factor being preferred network (ABC, CBS, NBC, or PBS, so I = 4) and the second
factor political philosophy (liberal, moderate, or conservative, giving J = 3).

For a sample of n individuals taken from the population, let $n_{ij}$ denote the
number among the n who fall both in category i of the first factor and category j of
the second factor. The $n_{ij}$'s can be displayed in a two-way contingency table with
I rows and J columns. In the case of homogeneity for I populations, the row totals
were fixed in advance, and only the J column totals were random. Now only the
total sample size is fixed, and both the $n_{i\cdot}$'s and $n_{\cdot j}$'s are observed values of random
variables. To state the hypotheses of interest, let


$p_{ij}$ = the proportion of individuals in the population who belong in category i
of factor 1 and category j of factor 2

= P(a randomly selected individual falls in both category i of factor 1 and
category j of factor 2)

Then

$p_{i\cdot} = \sum_{j} p_{ij}$ = P(a randomly selected individual falls in category i of factor 1)

$p_{\cdot j} = \sum_{i} p_{ij}$ = P(a randomly selected individual falls in category j of factor 2)

Recall that two events, A and B, are independent if $P(A \cap B) = P(A) \cdot P(B)$. The
null hypothesis here says that an individual’s category with respect to factor 1
is independent of the category with respect to factor 2. In symbols, this becomes
$p_{ij} = p_{i\cdot} \cdot p_{\cdot j}$ for every pair (i, j).

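The expected count in cell (i, j) is $n \cdot p_{ij}$, so when the null hypothesis is true,
$E(N_{ij}) = n \cdot p_{i\cdot} \cdot p_{\cdot j}$. To obtain a chi-squared statistic, we must therefore estimate the
$p_{i\cdot}$'s (i = 1, ..., I) and $p_{\cdot j}$'s (j = 1, ..., J). The (maximum likelihood) estimates are

$\hat{p}_{i\cdot} = n_{i\cdot}/n$ = sample proportion for category i of factor 1

and

$\hat{p}_{\cdot j} = n_{\cdot j}/n$ = sample proportion for category j of factor 2

This gives estimated expected cell counts identical to those in the case of homogeneity.

\[ \hat{e}_{ij} = n \cdot \hat{p}_{i\cdot} \cdot \hat{p}_{\cdot j} = n \cdot \frac{n_{i\cdot}}{n} \cdot \frac{n_{\cdot j}}{n} = \frac{n_{i\cdot}\, n_{\cdot j}}{n} = \frac{(i\text{th row total})(j\text{th column total})}{n} \]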


The test statistic is also identical to that used in testing for homogeneity, as is the number
of degrees of freedom. This is because the number of freely determined cell counts is
IJ − 1, since only the total n is fixed in advance. There are I estimated $p_{i\cdot}$'s, but only I − 1
are independently estimated since $\sum p_{i\cdot} = 1$; and similarly J − 1 $p_{\cdot j}$'s are independently
estimated, so I + J − 2 parameters are independently estimated. The rule of thumb now
yields df = IJ − 1 − (I + J − 2) = IJ − I − J + 1 = (I − 1)(J − 1).
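The two degrees-of-freedom arguments (freely determined cell counts minus independently estimated parameters) can be checked numerically. The short Python sketch below uses hypothetical function names and simply confirms, for a range of table sizes, that both counting arguments give (I − 1)(J − 1).

```python
def df_homogeneity(num_rows, num_cols):
    """Rule of thumb when the row totals (sample sizes) are fixed."""
    free_cells = num_rows * (num_cols - 1)   # J - 1 free counts in each row
    estimated = num_cols - 1                 # p_1, ..., p_J sum to 1
    return free_cells - estimated


def df_independence(num_rows, num_cols):
    """Rule of thumb when only the overall sample size n is fixed."""
    free_cells = num_rows * num_cols - 1
    estimated = (num_rows - 1) + (num_cols - 1)
    return free_cells - estimated


for rows in range(2, 7):
    for cols in range(2, 7):
        assert df_homogeneity(rows, cols) == (rows - 1) * (cols - 1)
        assert df_independence(rows, cols) == (rows - 1) * (cols - 1)

print(df_homogeneity(3, 5), df_independence(4, 4))
```

For instance, a 3 × 5 table gives 8 df and a 4 × 4 table gives 9 df under either sampling scheme.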


Null hypothesis: $H_0$: $p_{ij} = p_{i\cdot} \cdot p_{\cdot j}$, i = 1, ..., I; j = 1, ..., J
Alternative hypothesis: $H_a$: $H_0$ is not true


Test statistic value:

\[ \chi^2 = \sum_{\text{all cells}} \frac{(\text{observed} - \text{estimated expected})^2}{\text{estimated expected}} = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(n_{ij} - \hat{e}_{ij})^2}{\hat{e}_{ij}} \]

When $H_0$ is true and $\hat{e}_{ij} \ge 5$ for all i, j, the test statistic has approximately a
chi-squared distribution with (I − 1)(J − 1) df. The test is again upper-tailed,
so the P-value is the area under the $\chi^2_{(I-1)(J-1)}$ curve to the right of the calculated
$\chi^2$. Table A.11 can be used to obtain P-value information as described
in Section 14.1.
EXAMPLE 14.14 The accompanying two-way table from Minitab (Table 14.10) gives a cross-classification
in which the row factor is level of paternal education (completed university,
partial university, secondary, partial secondary) and the column factor represents
the quartile of neonatal (i.e., newborn) weight gain (Q1 = lowest 25%, Q2 = next
lowest 25%, Q3, Q4); the data appeared in the article “Impact of Neonatal Growth
on IQ and Behavior at Early School Age” (Pediatrics, July 2013, e53-60). Does
it appear that educational level is independent of NWG in the sampled population?


Table 14.10 Observed and Estimated Expected Counts for Example 14.14 


Expected counts are printed below observed counts 
Chi-Square contributions are printed below expected counts 


Q1 Q2 Q3 Q4 Total

E1 422 433 429 414 1698
411.63 444.79 422.64 418.93
0.261 0.313 0.096 0.058

E2 1493 1655 1556 1605 6309 
1529.44 1652.65 1570.35 1556.56 
0.868 0.003 0.131 1.508 

E3 1239 1276 1243 1179 4937 
1196.84 1293.25 1228.85 1218.06 
1.485 0.230 0.163 1.252 

E4 61 110 73 74 318 
77.09 83.30 79.15 78.46 
3.358 8.558 0.478 0.253 

Total 3215 3474 3301 3272 13262 


The contribution to $\chi^2$ from the cell in the upper-left corner is (422 − 411.63)²/411.63 =
.261. The 15 other contributions are calculated in the same way. Then $\chi^2$ = .261 +
··· + .253 = 19.016. When $H_0$ is true, the test statistic has approximately a chi-squared
distribution with (4 − 1)(4 − 1) = 9 df. The expected value of a chi-squared
rv is just its number of degrees of freedom, so $E(\chi^2) = 9$ under the assumption of
independence. Clearly the test statistic value exceeds what would be expected if the
two factors were independent, but is it by enough to suggest implausibility of this
null hypothesis? Table A.11 shows that .025 is the area to the right of 19.02 under the
chi-squared curve with 9 df. Thus the P-value for the test is roughly .025 (which is
the value calculated by Minitab; the cited article reported .03). At significance level
.05, the null hypothesis of independence would be rejected since P-value $\approx .025 \le
.05 = \alpha$. However, this conclusion would not be justified at a significance level of
.01. The P-value is such that people might argue over what conclusion is appropriate.
Someone persuaded by our analysis to reject the assertion of independence 
would want to look more closely at the data to seek an explanation for that conclu- 
sion. Perhaps, for example, those in a higher quartile tend to have higher educational 
levels. Figure 14.7 shows histograms (bar graphs) of the percentages in the various 
educational level categories for each of the four different quartiles. The four histo- 
grams appear to be very similar; the visual impression is that the distribution over the 
four educational levels does not depend much on the NWG quartile. This seemingly 
contradicts the finding of statistical significance. Now note that the sample size here 
is extremely large, and this inflates the value of the chi-squared statistic. With the 
same percentages as in Figure 14.7 but a much more moderate sample size, the value 
of $\chi^2$ would be much smaller and the P-value much larger. Our test result achieved
statistical significance, but there does not seem to be any practical significance. 
[Four histograms of the percentages in the educational level categories, one panel for each quartile: Q1, Q2, Q3, Q4]
Figure 14.7 Histograms based on the data of Example 14.14
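The effect of sample size on the chi-squared statistic in Example 14.14 can be illustrated with a short Python sketch. It recomputes the statistic for the observed table and then for a hypothetical table with the same cell percentages but one-tenth the sample size; the scaled table (with noninteger "counts") and the function name are assumptions made purely for illustration.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic for a two-way table of (possibly scaled) counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total


# Observed counts from Table 14.10 (rows E1-E4, columns Q1-Q4).
observed = [[422, 433, 429, 414],
            [1493, 1655, 1556, 1605],
            [1239, 1276, 1243, 1179],
            [61, 110, 73, 74]]

# Hypothetical table: same percentages, one-tenth the sample size.
scaled = [[count / 10 for count in row] for row in observed]

full = chi_squared_statistic(observed)
tenth = chi_squared_statistic(scaled)
print(round(full, 3), round(tenth, 3))
```

Because every observed and estimated expected count is multiplied by the same factor, the statistic shrinks by that factor as well. With one-tenth the sample size the value would be well below the 9 expected under independence, so the same percentages would give no evidence against the null hypothesis.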


Models and methods for analyzing data in which each individual is catego- 
rized with respect to three or more factors (multidimensional contingency tables) are 
discussed in several of the chapter references. 


EXERCISES Section 14.3 (24—36) 


24. The accompanying two-way table was constructed using
data in the article “Television Viewing and Physical
Fitness in Adults” (Research Quarterly for Exercise and
Sport, 1990: 315-320). The author hoped to determine
whether time spent watching television is associated with
cardiovascular fitness. Subjects were asked about their
television-viewing habits and were classified as physically
fit if they scored in the excellent or very good category on
a step test. We include Minitab output from a chi-squared
analysis. The four TV groups corresponded to different
amounts of time per day spent watching TV (0, 1-2, 3-4,
or 5 or more hours). The 168 individuals represented in
the first column were those judged physically fit. Expected
counts appear below observed counts, and Minitab displays
the contribution to $\chi^2$ from each cell. State and test
the appropriate hypotheses using $\alpha = .05$.

              1         2     Total
1            35       147       182
          25.48    156.52
2           101       629       730
         102.20    627.80
3            28       222       250
          35.00    215.00
4             4        34        38
           5.32     32.68
Total       168      1032      1200

ChiSq = 3.557 + 0.579 +
        0.014 + 0.002 +
        1.400 + 0.228 +
        0.328 + 0.053 = 6.161
df = 3


25. In an investigation of alcohol use among college students,
each male student in a sample was categorized
both according to age group and according to the number
of heavy drinking episodes during the previous 30 days
(“Alcohol Use in Students Seeking Primary Care
Treatment at University Health Services,” J. of Amer.
College Health, 2012: 217-225).

                          Age Group
                   18-20    21-23     ≥24
           None      357      293     592
# Episodes 1-2       218      285     354
           3-4       184      218     185
           ≥5        328      331     147

Does there appear to be an association between extent
of binge drinking and age group in the population from
which the sample was selected? Carry out a test of
hypotheses at significance level .01.




26. 


27. 


28. 


29. 


Contamination of various food products is an ongoing 
problem all over the world. The article “Prevalence and
Quantitative Detection of Salmonella in Retail Raw 
Chicken in Shaanxi, China” (J. of Food Production, 
2013) reported the following data on the occurrence of 
salmonella in chicken of three different types: (1) super- 
market chilled, (2) supermarket frozen, and (3) wet 
market fresh slaughtered. 


                          # Salmonella
          Sample Size   Positive Samples
      1.       60              27
Type  2.       60              32
      3.      120              45


Does it appear that the incidence rate of salmonella 
occurrence depends on the type of chicken? State and 
test the appropriate hypotheses using a significance level 
of .05. 


The article “Human Lateralization from Head to Foot: 
Sex-Related Factors” (Science, 1978: 1291-1292) 
reports for both a sample of right-handed men and a 
sample of right-handed women the number of individuals 
whose feet were the same size, had a bigger left than right 
foot (a difference of half a shoe size or more), or had a 
bigger right than left foot. 


Sample 


L>R Size 


Men 2 10 28 40 


Women 55 18 14 87 


Does the data indicate that gender has a strong effect on 
the development of foot asymmetry? State and test the 
appropriate hypotheses. 


A random sample of 175 Cal Poly State University stu- 
dents was selected, and both the email service provider 
and cell phone provider were determined for each one, 
resulting in the accompanying data. State and test the 
appropriate hypotheses.


                       Cell Phone Provider
                     ATT    Verizon    Other
           gmail      28       17         7
Email      Yahoo      31       26        10
Provider   Other      26       19        11


The accompanying data on degree of spirituality for 
samples of natural and social scientists at research univer- 
sities as well as for a sample of non-academics with 
graduate degrees appeared in the article “Conflict 
Between Religion and Science Among Academic 
Scientists” (J. for the Scientific Study of Religion, 2009: 
276-292). 


30. 


31. 


32. 
                  Degree of Spirituality

         Very   Moderate   Slightly   Not at all
N.S.       56      162        198         211
S.S.       56      223        243         239
G.D.      109      164         74          28


a. Is there substantial evidence for concluding that the
three types of individuals are not homogeneous with
respect to their degree of spirituality? State and test
the appropriate hypotheses.

b. Considering just the natural scientists and social 
scientists, is there evidence for non-homogeneity? 


Three different design configurations are being consid- 
ered for a particular component. There are four possible 
failure modes for the component. An engineer obtained 
the following data on number of failures in each mode 
for each of the three configurations. Does the configura- 
tion appear to have an effect on type of failure? 


                        Failure Mode
                     1     2     3     4
                1   20    44    17     9
Configuration   2    4    17     7    12
                3   10    31    14     5


A random sample of smokers was obtained, and each 
individual was classified both with respect to gender 
and with respect to the age at which he/she first started 
smoking. The data in the accompanying table is con- 
sistent with summary results reported in the article 
“Cigarette Tar Yields in Relation to Mortality in 
the Cancer Prevention Study II Prospective 
Cohort” (British Med. J., 2004: 72-79). 


                Gender
             Male   Female
      <16     25      10
Age   16-17   24      32
      18-20   28      17
      >20     19      34


a. Calculate the proportion of males in each age category, 
and then do the same for females. Based on these pro- 
portions, does it appear that there might be an associa- 
tion between gender and the age at which an individu- 
al first smokes? 

b. Carry out a test of hypotheses to decide whether 
there is an association between the two factors. 


Eclosion refers to the emergence of an adult insect from 
an egg. The following data on eclosion rates when 
nymphs were exposed to heat for various durations was 
extracted from the article “High Temperature 
Determines the Ups and Downs of Small Brown 
Planthopper Laodelphax Striatellus Population” 
(Insect Science, 2012: 385-392).

Duration (d)     0     1     2     3     5    10    15
Sample size    120    41    47    44    46    42    10
# Emerged      101    38    44    40    38    35     …


Carry out a chi-squared test to decide whether it is 
plausible that eclosion rate does not depend on exposure 
duration (the cited article included summary information 
from the test). 


33. Show that the chi-squared statistic for the test of inde- 
pendence can be written in the form 


\[ \chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \left( \frac{N_{ij}^2}{\hat{E}_{ij}} \right) - n \]

Why is this formula more efficient computationally than
the defining formula for $\chi^2$?


34. Suppose that each student in a sample had been catego- 
rized with respect to political views, marijuana usage, 
and religious preference, with the categories of this latter 
factor being Protestant, Catholic, and other. The data could 
be displayed in three different two-way tables, one 
corresponding to each category of the third factor. With 
$p_{ijk}$ = P(political category i, marijuana category j, and
religious category k), the null hypothesis of independence
of all three factors states that $p_{ijk} = p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}$. Let $n_{ijk}$
denote the observed frequency in cell (i, j, k). Show how
to estimate the expected cell counts assuming that $H_0$ is
true ($\hat{e}_{ijk} = n\hat{p}_{ijk}$, so the $\hat{p}_{ijk}$'s must be determined). Then
use the general rule of thumb to determine the number of
degrees of freedom for the chi-squared statistic.


35. 


36. 


Suppose that in a particular state consisting of four distinct
regions, a random sample of $n_k$ voters is obtained from the
kth region for k = 1, 2, 3, 4. Each voter is then classified
according to which candidate (1, 2, or 3) he or she prefers
and according to voter registration (1 = Dem., 2 = Rep.,
3 = Indep.). Let $p_{ijk}$ denote the proportion of voters in
region k who belong in candidate category i and registration
category j. The null hypothesis of homogeneous regions is
$H_0$: $p_{ij1} = p_{ij2} = p_{ij3} = p_{ij4}$ for all i, j (i.e., the proportion
within each candidate/registration combination is the same
for all four regions). Assuming that $H_0$ is true, determine $\hat{p}_{ijk}$
and $\hat{e}_{ijk}$ as functions of the observed $n_{ijk}$'s, and use the
general rule of thumb to obtain the number of degrees of
freedom for the chi-squared test.


Consider the accompanying 2 X 3 table displaying the 
sample proportions that fell in the various combinations 
of categories (e.g., 13% of those in the sample were in 
the first category of both factors). 


       1     2     3
1    .13   .19   .28
2    .07   .11   .22

a. Suppose the sample consisted of n = 100 people. 
Use the chi-squared test for independence with sig- 
nificance level .10. 

b. Repeat part (a), assuming that the sample size was 
n = 1000. 

c. What is the smallest sample size n for which these 
observed proportions would result in rejection of the 
independence hypothesis? 


SUPPLEMENTARY EXERCISES (37-49) 


37. The article “Birth Order and Political Success” (Psych. 
Reports, 1971: 1239-1242) reports that among 31 ran- 
domly selected candidates for political office who came 
from families with four children, 12 were firstborn, 11 
were middle born, and 8 were last born. Use this data to 
test the null hypothesis that a political candidate from 
such a family is equally likely to be in any one of the four 
ordinal positions. 


38. Does the phase of the moon have any bearing on 
birthrate? Each of 222,784 births that occurred during 
a period encompassing 24 full lunar cycles was clas- 
sified according to lunar phase. The following data is 
consistent with summary quantities that appeared in 
the article “The Effect of the Lunar Cycle on 
Frequency of Births and Birth Complications” 
(Amer. J. of Obstetrics and Gynecology, 2005: 
1462-1464). 
39.


Lunar Phase        # Days in Phase    # Births
New moon                  24             7680
Waxing crescent          152           48,442
First quarter             24             7579
Waxing gibbous           149           47,814
Full moon                 24             7711
Waning gibbous           150           47,595
Last quarter              24             7733
Waning crescent          152           48,230


State and test the appropriate hypotheses to answer the 
question posed at the beginning of this exercise. 


Each individual in a sample of nursing home patients 
was cross-classified both with respect to cognitive state
(normal or mild impairment, moderate impairment,
severe impairment) and with respect to drug status
(psychotropic drug change, psychotropic user without a
change, no psychotropic medication). The following
Minitab output resulted from a request to perform a
chi-squared analysis.


            Drug change   No change   No med   Total

Normal           83           60         46      189
               90.06        64.11      34.83
               0.554        0.263      3.584

Moderate        237          151         70      458
              218.25       155.35      84.40
               1.611        0.122      2.456

Severe           86           78         41      205
               97.69        69.54      37.78
               1.398        1.030      0.275

Total           406          289        157      852

Chi-Sq = 11.294, DF = 4, P-Value = 0.023


(“Psychotropic Drug Initiation or Increased Dosage
and the Acute Risk of Falls,” BMC Geriatrics, 2013:
13:19).

a. Verify the expected frequency and contribution
to χ² in the normal/drug change cell of the two-way
table.

b. Does there appear to be an association between
cognitive state and drug status? State and test the
appropriate hypotheses using a significance level of .01.
[Note: The cited article reported a P-value.]


40. The authors of the article “Predicting Professional
Sports Game Outcomes from Intermediate Game 
Scores” (Chance, 1992: 18-22) used a chi-squared test to 
determine whether there was any merit to the idea that 
basketball games are not settled until the last quarter, 
whereas baseball games are over by the seventh inning. 
They also considered football and hockey. Data was col- 
lected for 189 basketball games, 92 baseball games, 
80 hockey games, and 93 football games. The games ana- 
lyzed were sampled randomly from all games played dur- 
ing the 1990 season for baseball and football and for the 
1990-1991 season for basketball and hockey. For each 
game, the late-game leader was determined, and then it 
was noted whether the late-game leader actually ended up 
winning the game. The resulting data is summarized in the 
accompanying table. 


Late-Game Late-Game 
Sport Leader Wins Leader Loses 
Basketball 150 39 
Baseball 86 6 
Hockey 65 15 
Football 72 21 


The authors state that “Late-game leader is defined as 
the team that is ahead after three quarters in basketball 
and football, two periods in hockey, and seven innings 
in baseball. The chi-square value on three degrees of 
freedom is 10.52 (P < .015).” 


a. State the relevant hypotheses and reach a conclusion
using α = .05.

b. Do you think that your conclusion in part (a) can be
attributed to a single sport being an anomaly?


41. The accompanying two-way frequency table appears
in the article “Marijuana Use in College” (Youth and 
Society, 1979: 323-334). Each of 445 college students 
was classified according to both frequency of mari- 
juana use and parental use of alcohol and psychoactive 
drugs. Does the data suggest that parental usage and 
student usage are independent in the population from 
which the sample was drawn? 


                                  Standard Level of Marijuana Use

                                  Never    Occasional    Regular
Parental Use of       Neither      141         54           40
Alcohol and Drugs     One           68         44           51
                      Both          17         11           19


42. Much attention has recently focused on the incidence
of concussions among athletes. Separate samples of 
soccer players, non-soccer athletes, and non-athletes 
were selected. The accompanying table then resulted 
from determining the number of concussions each 
individual reported on a medical history questionnaire 
(“No Evidence of Impaired Neurocognitive 
Performance in Collegiate Soccer Players,” Amer. J. 
of Sports Med., 2002: 157-162). 


                       # of Concussions

                    0      1      2     ≥3
Soccer             45     25     11     10
N-S Athletes       68     15      8      5
Non-athletes       45      5      3      0


Does the distribution of # of concussions appear to be 
different for the three types of individuals? Carry out a 
test of hypotheses. 


43. In a study to investigate the extent to which individuals
are aware of industrial odors in a certain region
(“Annoyance and Health Reactions to Odor from
Refineries and Other Industries in Carson,
California,” Environmental Research, 1978: 119-132),
a sample of individuals was obtained from each of three
different areas near industrial facilities. Each individual
was asked whether he or she noticed odors (1) every day,
(2) at least once/week, (3) at least once/month, (4) less
often than once/month, or (5) not at all, resulting in the
data and SPSS output labeled “SPSS output for Exercise 43.”
State and test the appropriate hypotheses.


44. Many shoppers have expressed unhappiness because
grocery stores have stopped putting prices on individual
grocery items. The article “The Impact of Item Price
Removal on Grocery Shopping Behavior” (J. of
Marketing, 1980: 73-93) reports on a study in which
each shopper in a sample was classified by age and by
whether he or she felt the need for item pricing. Based
on the accompanying data, does the need for item pricing
appear to be independent of age?


                                      Age
                          <30    30-39    40-49    50-59    ≥60
Number in Sample          150     141       82       63      49
Number Who Want
Item Pricing              127     118       77       61      41


45. Let p₁ denote the proportion of successes in a particular
population. The test statistic value in Chapter 8 for testing
H₀: p₁ = p₁₀ was z = (p̂₁ − p₁₀)/√(p₁₀p₂₀/n), where
p₂₀ = 1 − p₁₀. Show that for the case k = 2, the chi-squared
test statistic value of Section 14.1 satisfies χ² = z².
[Hint: First show that (n₁ − np₁₀)² = (n₂ − np₂₀)².]


46. The NCAA basketball tournament begins with 64 teams
that are apportioned into four regional tournaments,
each involving 16 teams. The 16 teams in each region
are then ranked (seeded) from 1 to 16. During the
12-year period from 1991 to 2002, the top-ranked team
won its regional tournament 22 times, the second-ranked
team won 10 times, the third-ranked team won 5 times,
and the remaining 11 regional tournaments were won by
teams ranked lower than 3. Let Pᵢⱼ denote the probability
that the team ranked i in its region is victorious in its
game against the team ranked j. Once the Pᵢⱼ’s are
available, it is possible to compute the probability that any
particular seed wins its regional tournament (a complicated
calculation because the number of outcomes in the
sample space is quite large). The paper “Probability
Models for the NCAA Regional Basketball
Tournaments” (American Statistician, 1991: 35-38)
proposed several different models for the Pᵢⱼ’s.

a. One model postulated Pᵢⱼ = .5 − λ(i − j) with
λ = 1/32 (from which P₁₆,₁ = λ, P₁₆,₂ = 2λ, etc.).
Based on this, P(seed #1 wins) = .27477, P(seed #2
wins) = .20834, and P(seed #3 wins) = .15429.
Does this model appear to provide a good fit to the
data?

b. A more sophisticated model has game probabilities
Pᵢⱼ = .5 + .2813625(zᵢ − zⱼ), where the z’s are measures
of relative strengths related to standard normal
percentiles [percentiles for successive highly seeded
teams are closer together than is the case for teams
seeded lower, and .2813625 ensures that the range of
probabilities is the same as for the model in part (a)].
The resulting probabilities of seeds 1, 2, or 3 winning
their regional tournaments are .45883, .18813, and
.11032, respectively. Assess the fit of this model.


SPSS output for Exercise 43
Crosstabulation: AREA BY CATEGORY


             Count
             Exp Val
CATEGORY →   Row Pct                                                  Row
AREA         Col Pct     1.00     2.00     3.00     4.00     5.00    Total

1.00                       20       28       23       14       12       97
                         12.7     24.7     18.0     16.0     25.7    33.3%
                        20.6%    28.9%    23.7%    14.4%    12.4%
                        52.6%    37.8%    42.6%    29.2%    15.6%

2.00                       14       34       21       14       12       95
                         12.4     24.2     17.6     15.7     25.1    32.6%
                        14.7%    35.8%    22.1%    14.7%    12.6%
                        36.8%    45.9%    38.9%    29.2%    15.6%

3.00                        4       12       10       20       53       99
                         12.9     25.2     18.4     16.3     26.2    34.0%
                         4.0%    12.1%    10.1%    20.2%    53.5%
                        10.5%    16.2%    18.5%    41.7%    68.8%

Column                     38       74       54       48       77      291
Total                   13.1%    25.4%    18.6%    16.5%    26.5%   100.0%

Chi-Square     D.F.    Significance    Min E.F.    Cells with E.F. < 5
70.64156         8        .0000         12.405     None
47. Have you ever wondered whether soccer players suffer
adverse effects from hitting “headers”? The authors of
the article “No Evidence of Impaired Neurocognitive
Performance in Collegiate Soccer Players” (Amer. J.
of Sports Med., 2002: 157-162) investigated this issue
from several perspectives.

a. The paper reported that 45 of the 91 soccer players 
in their sample had suffered at least one concussion, 
28 of 96 nonsoccer athletes had suffered at least one 
concussion, and only 8 of 53 student controls had 
suffered at least one concussion. Analyze this data 
and draw appropriate conclusions. 

b. For the soccer players, the sample correlation 
coefficient calculated from the values of x = soc- 
cer exposure (total number of competitive seasons 
played prior to enrollment in the study) and y = 
score on an immediate memory recall test was 
r = −.220. Interpret this result.

c. Here is summary information on scores on a con- 
trolled oral word-association test for the soccer and 
nonsoccer athletes: 


n₁ = 26, x̄₁ = 37.50, s₁ = 9.13
n₂ = 56, x̄₂ = 39.63, s₂ = 10.19


Analyze this data and draw appropriate conclusions. 

d. Considering the number of prior nonsoccer concussions,
the values of mean ± sd for the three groups
were .30 ± .67, .49 ± .87, and .19 ± .48. Analyze
this data and draw appropriate conclusions.


48. Do the successive digits in the decimal expansion of π
behave as though they were selected from a random
number table (or came from a computer’s random number
generator)?

a. Let p₀ denote the long run proportion of digits in
the expansion that equal 0, and define p₁, ..., p₉
analogously. What hypotheses about these proportions
should be tested, and what is df for the chi-squared
test?

b. H₀ of part (a) would not be rejected for the nonrandom
sequence 012...901...901.... Consider nonoverlapping
groups of two digits, and let pᵢⱼ denote the long
run proportion of groups for which the first digit is
i and the second digit is j. What hypotheses about
these proportions should be tested, and what is df for
the chi-squared test?

c. Consider nonoverlapping groups of 5 digits. Could a
chi-squared test of appropriate hypotheses about the
pᵢⱼₖₗₘ’s be based on the first 100,000 digits? Explain.

d. The article “Are the Digits of π an Independent and
Identically Distributed Sequence?” (The American
Statistician, 2000: 12-16) considered the first 1,254,540
digits of π, and reported the following P-values for
group sizes of 1, ..., 5: .572, .078, .529, .691, .298. What
would you conclude?

49. The Fibonacci sequence of numbers occurs in various
scientific contexts. The first two numbers in the sequence
are 1, 1. Then every succeeding number is the sum of the
two previous numbers: 1, 1, 1 + 1 = 2, 1 + 2 = 3, 2 + 3 = 5,
8, 13, 21, .... The first digit of any number in this
sequence can be 1, 2, ..., or 9. The frequencies of first
digits for the first 85 numbers in the sequence are as
follows: 25 (1’s), 16 (2’s), 11, 7, 7, 5, 4, 6, 4. Does the
distribution of first digits in the Fibonacci sequence
appear to be consistent with the Benford’s Law distribution
described in Exercise 21 of Chapter 3? State and
test the relevant hypotheses.


Distribution-Free Procedures


INTRODUCTION 


When the underlying population or populations are nonnormal, the t and F
tests and t confidence intervals of Chapters 7-13 will in general have actual
levels of significance or confidence levels that differ from the nominal levels
α and 100(1 − α)% (those prescribed by the experimenter through the choice of
t and F critical values), although the difference between actual and nominal levels
may not be large when the departure from normality is not too severe. Because
the t and F procedures require the distributional assumption of normality, they
are not “distribution-free” procedures; alternatively, because they are based
on a particular parametric family of distributions (normal), they are not
“nonparametric” procedures.

In this chapter, we describe procedures that are valid [actual significance
level α or confidence level 100(1 − α)%] simultaneously for many different
types of underlying distributions. Such procedures are called distribution-free 
or nonparametric. One- and two-sample test procedures are presented in 
Sections 15.1 and 15.2, respectively. In Section 15.3, we develop distribution- 
free confidence intervals. Section 15.4 describes distribution-free ANOVA 
procedures. These procedures are all competitors of the parametric (t and F) 
procedures described in previous chapters, so it is important to compare the 
performance of the two types of procedures under both normal and nonnor- 
mal population models. Generally speaking, the distribution-free procedures 
perform almost as well as their t and F counterparts on the “home ground” of 
the normal distribution and will often yield a considerable improvement under 
nonnormal conditions. 


15.1 The Wilcoxon Signed-Rank Test


A research chemist performed a particular chemical experiment a total of ten times under 
identical conditions, obtaining the following ordered values of reaction temperature: 


−.57   −.19   −.05   .76   1.30   2.02   2.17   2.46   2.68   3.02


The distribution of reaction temperature is of course continuous. Suppose the
investigator is willing to assume that the reaction temperature distribution is
symmetric; that is, there is a point of symmetry such that the density curve to the
left of that point is the mirror image of the density curve to its right. This point of
symmetry is the median μ̃ of the distribution (and is also the mean value μ, provided
that the mean is finite). The assumption of symmetry may at first thought
seem quite bold, but remember that any normal distribution is symmetric, so
symmetry is actually a weaker assumption than normality.

Let’s now consider testing H₀: μ̃ = 0 versus Hₐ: μ̃ > 0. The null hypothesis
can be interpreted as saying that a temperature of any particular magnitude, for
example, 1.50, is no more likely to be positive (+1.50) than it is to be negative
(−1.50). A glance at the data suggests that this hypothesis is not very tenable; for
example, the sample median is 1.66, which is far larger than the magnitude of any
of the three negative observations.

Figure 15.1 shows two different symmetric pdf’s, one for which H₀ is true and
one for which Hₐ is true. When H₀ is true, we expect the magnitudes of the negative
observations in the sample to be comparable to the magnitudes of the positive
observations. If, however, H₀ is “grossly” untrue as in Figure 15.1(b), then observations of
large absolute magnitude will tend to be positive rather than negative.


Figure 15.1 Distributions for which (a) μ̃ = 0; (b) μ̃ >> 0


For the sample of ten reaction temperatures, let’s for the moment disregard the
signs of the observations and rank the absolute magnitudes from 1 to 10, with the
smallest getting rank 1, the second smallest rank 2, and so on. Then apply the sign
of each observation to the corresponding rank to obtain signed ranks. Typically
some signed ranks will be negative (e.g., −3), whereas others will be positive
(e.g., 8). The test statistic will be S₊ = the sum of the positively signed ranks.


Absolute Magnitude    .05   .19   .57   .76  1.30  2.02  2.17  2.46  2.68  3.02
Rank                    1     2     3     4     5     6     7     8     9    10
Signed Rank            −1    −2    −3     4     5     6     7     8     9    10


s₊ = 4 + 5 + 6 + 7 + 8 + 9 + 10 = 49
When the median of the distribution is much greater than 0, most of the observations
with large absolute magnitudes should be positive, resulting in positively signed
ranks and a large value of s₊. On the other hand, if the median is 0, magnitudes
of positively signed observations should be intermingled with those of negatively
signed observations, in which case s₊ will not be very large. So intuitively, a larger
s₊ value provides more evidence against H₀ than does a smaller value. This implies
that the test is upper-tailed: The P-value will be P₀(S₊ ≥ s₊), where P₀ represents the
probability calculated assuming that H₀ is true. Thus we must determine the distribution
of S₊ when the null hypothesis is true, that is, its null distribution.

Consider n = 5, in which case there are 2⁵ = 32 ways of applying signs to the
five ranks 1, 2, 3, 4, and 5 (each rank could have a − sign or a + sign). The key
point is that when H₀ is true, any collection of five signed ranks has the same chance
as does any other collection. That is, the smallest observation in absolute magnitude 
is equally likely to be positive or negative, the same is true of the second smallest 
observation in absolute magnitude, and so on. Thus the collection —1, 2, 3, —4, 5 
of signed ranks is just as likely as the collection 1, 2, 3, 4, —5, and just as likely as 
any one of the other 30 possibilities. 

Table 15.1 lists the 32 possible signed-rank sequences when n = 5, along with
the value s₊ for each sequence. This immediately gives the null distribution of S₊
displayed in Table 15.2. For example, Table 15.1 shows that three of the 32 possible
sequences have s₊ = 8, so P(S₊ = 8) = 1/32 + 1/32 + 1/32 = 3/32. Notice that
the null distribution is symmetric about 7.5 [more generally, symmetrically distributed
over the possible values 0, 1, 2, ..., n(n + 1)/2]. This symmetry is important in relating
the P-value for lower-tailed and two-tailed tests to that of an upper-tailed test.


Table 15.1 Possible Signed-Rank Sequences for n = 5

Sequence               s₊      Sequence               s₊
−1  −2  −3  −4  −5      0      −1  −2  −3  +4  −5      4
+1  −2  −3  −4  −5      1      +1  −2  −3  +4  −5      5
−1  +2  −3  −4  −5      2      −1  +2  −3  +4  −5      6
+1  +2  −3  −4  −5      3      +1  +2  −3  +4  −5      7
−1  −2  +3  −4  −5      3      −1  −2  +3  +4  −5      7
+1  −2  +3  −4  −5      4      +1  −2  +3  +4  −5      8
−1  +2  +3  −4  −5      5      −1  +2  +3  +4  −5      9
+1  +2  +3  −4  −5      6      +1  +2  +3  +4  −5     10
−1  −2  −3  −4  +5      5      −1  −2  −3  +4  +5      9
+1  −2  −3  −4  +5      6      +1  −2  −3  +4  +5     10
−1  +2  −3  −4  +5      7      −1  +2  −3  +4  +5     11
+1  +2  −3  −4  +5      8      +1  +2  −3  +4  +5     12
−1  −2  +3  −4  +5      8      −1  −2  +3  +4  +5     12
+1  −2  +3  −4  +5      9      +1  −2  +3  +4  +5     13
−1  +2  +3  −4  +5     10      −1  +2  +3  +4  +5     14
+1  +2  +3  −4  +5     11      +1  +2  +3  +4  +5     15


Table 15.2 Null Distribution of S₊ When n = 5

s₊         0     1     2     3     4     5     6     7
p(s₊)    1/32  1/32  1/32  2/32  2/32  3/32  3/32  3/32

s₊         8     9    10    11    12    13    14    15
p(s₊)    3/32  3/32  3/32  2/32  2/32  1/32  1/32  1/32

For n = 10 there are 2¹⁰ = 1024 possible signed-rank sequences, so a listing
would involve much effort. Each sequence, though, would have probability 1/1024
when H₀ is true, from which the distribution of S₊ when H₀ is true can be easily obtained.
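
The enumeration argument can be sketched in a few lines of Python. This is only an illustration, not part of the text's development: the function name null_distribution is hypothetical, and the sketch assumes no ties, so that the ranks are exactly 1, ..., n. It lists every equally likely sign assignment and tallies the resulting values of s₊.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def null_distribution(n):
    """Exact null distribution of S+ for sample size n, by brute-force enumeration.

    Under H0 each of the 2**n sign assignments to the ranks 1..n is equally likely.
    """
    counts = Counter()
    for signs in product((0, 1), repeat=n):  # 1 means the rank carries a + sign
        s_plus = sum(rank for rank, positive in zip(range(1, n + 1), signs) if positive)
        counts[s_plus] += 1
    total = 2 ** n
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


dist5 = null_distribution(5)
print(dist5[8])                                             # 3/32
print(float(sum(p for s, p in dist5.items() if s >= 13)))   # 0.09375
```

Calling null_distribution(10) runs through all 1024 sequences in the same way; brute force is adequate for small n, although the number of sequences doubles with each additional observation.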

We are now in a position to calculate a P-value for testing H₀: μ̃ = 0 versus
Hₐ: μ̃ > 0 when n = 5. Suppose that s₊ = 13. Then


P-value = P(S₊ ≥ 13 when H₀ is true)
        = P(S₊ = 13, 14, or 15)
        = 1/32 + 1/32 + 1/32 = 3/32
        = .094


If s₊ = 14, then P-value = 2/32 = .063. For the sample x₁ = .58, x₂ = 2.50,
x₃ = −.21, x₄ = 1.23, x₅ = .97, the signed rank sequence is −1, +2, +3, +4, +5,
so s₊ = 14. Thus H₀ would be rejected at significance level .10 because P-value =
.063 ≤ .10 = α. However, at significance level .05 or .01, there would not be enough
evidence to justify rejecting the null hypothesis.


General Description of the Test 


Because the underlying distribution is assumed symmetric, μ = μ̃, so we will state
the hypotheses of interest in terms of μ rather than μ̃.*


ASSUMPTION


X₁, X₂, ..., Xₙ is a random sample from a continuous and symmetric probability
distribution with mean (and median) μ.


When the hypothesized value of μ is μ₀, the absolute differences |x₁ − μ₀|, ...,
|xₙ − μ₀| must be ranked from smallest to largest.

Null hypothesis: H₀: μ = μ₀
Test statistic value: s₊ = the sum of the ranks associated with positive
(xᵢ − μ₀)’s


Alternative Hypothesis        P-Value Determination

Hₐ: μ > μ₀                    P₀(S₊ ≥ s₊)
Hₐ: μ < μ₀                    P₀(S₊ ≤ s₊) = P₀(S₊ ≥ n(n + 1)/2 − s₊)
Hₐ: μ ≠ μ₀                    2P₀(S₊ ≥ max{s₊, n(n + 1)/2 − s₊})


Appendix Table A.13 gives P₀(S₊ ≥ c) = P(S₊ ≥ c when H₀ is true) for
values of c for which this probability is closest to .1, .05, .025, .01, and .005.
This allows conclusions to be reached at significance levels that are at least
approximately .10, .05, and .01.


* If the tails of the distribution are “too heavy,” as was the case with the Cauchy distribution mentioned in
Chapter 6, then μ will not exist. In such cases, the Wilcoxon test will still be valid for tests concerning μ̃.


Suppose, for example, that the test is upper-tailed and based on n = 10. Table A.13
shows that P₀(S₊ ≥ 41) = .097 and P₀(S₊ ≥ 44) = .053. So if s₊ = 40, then the
P-value exceeds .10. The value s₊ = 42 implies that .05 < P-value < .10, allowing
for rejection of the null hypothesis at significance level .10 but not at significance
level .05. If s₊ = 44, it is a really close call at significance level .05.

In the case of a lower-tailed test based on n = 10, the value s₊ = 13 results
in P-value P₀(S₊ ≤ 13). By symmetry of the null distribution, this is identical to
P₀(S₊ ≥ 10(11)/2 − 13) = P₀(S₊ ≥ 42). The P-value is then between .05 and .10.
If a two-tailed test results in s₊ = 44 when n = 10, then max{44, 55 − 44} = 44.
Thus the P-value is 2P₀(S₊ ≥ 44) = 2(.053) = .106. This would also be the P-value
if s₊ = 11, since max{11, 55 − 11} = 44; the value 11 is just as far out in the lower
tail of the null distribution as 44 is in the upper tail.
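
The three P-value rules can also be evaluated exactly rather than bracketed from a table. The following Python sketch is one possible way to do so, under the assumption that there are no ties (so s₊ is an integer). The function names are hypothetical, and the counting step is an ordinary subset-sum recursion rather than anything prescribed by the text.

```python
def subset_sum_counts(n):
    """counts[s] = number of subsets of {1, ..., n} whose elements sum to s."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downward so that each rank is used at most once.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts


def upper_tail(n, c):
    """P0(S+ >= c): each of the 2**n sign patterns is equally likely under H0."""
    counts = subset_sum_counts(n)
    return sum(counts[max(c, 0):]) / 2 ** n


def signed_rank_p_value(n, s_plus, alternative):
    """Exact P-value; alternative is 'greater', 'less', or 'two-sided'."""
    max_sum = n * (n + 1) // 2
    if alternative == "greater":
        return upper_tail(n, s_plus)
    if alternative == "less":
        # Symmetry of the null distribution: P0(S+ <= s) = P0(S+ >= max_sum - s).
        return upper_tail(n, max_sum - s_plus)
    if alternative == "two-sided":
        return min(1.0, 2 * upper_tail(n, max(s_plus, max_sum - s_plus)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


print(round(upper_tail(10, 41), 3))   # 0.097
print(round(upper_tail(10, 44), 3))   # 0.053
print(signed_rank_p_value(10, 13, "less") == upper_tail(10, 42))   # True
```

Because the two-tailed rule doubles an unrounded tail probability here, its result can differ in the third decimal place from a value obtained by doubling a rounded table entry.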


EXAMPLE 15.1 A manufacturer of electric irons, wishing to test the accuracy of the thermostat
control at the 500°F setting, instructs a test engineer to obtain actual temperatures at that
setting for 15 irons using a thermocouple. The resulting measurements are as follows:


494.6 510.8 487.5 493.2 502.6 485.0 495.9 498.2
501.6 497.3 492.0 504.3 499.2 493.5 505.8


The engineer believes it is reasonable to assume that a temperature deviation from
500° of any particular magnitude is just as likely to be positive as negative (the
assumption of symmetry) but wants to protect against possible nonnormality of the
actual temperature distribution, so she decides to use the Wilcoxon signed-rank test
to see whether the data strongly suggests incorrect calibration of the iron.

The hypotheses are H₀: μ = 500 versus Hₐ: μ ≠ 500, where μ = the true
average actual temperature at the 500°F setting. Subtracting 500 from each xᵢ gives


−5.6   10.8   −12.5   −6.8   2.6   −15.0   −4.1   −1.8   1.6   −2.7
−8.0   4.3   −.8   −6.5   5.8


The ranks are obtained by ordering these from smallest to largest without regard to
sign.


Absolute
Magnitude    .8   1.6   1.8   2.6   2.7   4.1   4.3   5.6   5.8   6.5   6.8   8.0  10.8  12.5  15.0
Rank          1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
Sign          −     +     −     +     −     −     +     −     +     −     −     −     +     −     −


Thus s₊ = 2 + 4 + 7 + 9 + 13 = 35. With n(n + 1)/2 = 120, the P-value for a
two-tailed test is 2P₀(S₊ ≤ 35) = 2P₀(S₊ ≥ 85). Appendix Table A.13 shows that
P₀(S₊ ≥ 89) = .053, so P-value > 2(.053) = .106. Even at significance level .10,
the null hypothesis cannot be rejected, so it certainly cannot be rejected at level .05.
Software gives .164 as the P-value. There is no reason to question the plausibility of
500 as the value of the population mean and median. ■


Although a theoretical implication of the continuity of the underlying distribution
is that ties will not occur, in practice they often do because of the discreteness
of measuring instruments. If there are several data values with the same absolute
magnitude, then they would be assigned the average of the ranks they would receive
if they differed very slightly from one another. For example, if in Example 15.1
x₈ = 498.2 is changed to 498.4, then two different values of (xᵢ − 500) would have
absolute magnitude 1.6. The ranks to be averaged would be 2 and 3, so each would
be assigned rank 2.5.
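
The ranking step, including the averaging of tied ranks, can be sketched as follows. This is an illustrative Python version, not a reference implementation: the names signed_ranks and s_plus are hypothetical, differences are rounded (an assumption made here so that floating-point error does not hide a genuine tie), and zero differences are assumed not to occur.

```python
def signed_ranks(data, mu0, decimals=6):
    """Signed ranks of (x - mu0), using average ranks for tied magnitudes.

    Differences are rounded to `decimals` places so that values such as
    498.4 - 500 and 501.6 - 500 compare as equal in magnitude despite
    floating-point error. Zero differences are assumed absent.
    """
    diffs = [round(x - mu0, decimals) for x in data]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next magnitude equals the current one.
        while end + 1 < len(order) and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        midrank = (start + 1 + end + 1) / 2  # average of ranks start+1 .. end+1
        for k in range(start, end + 1):
            i = order[k]
            ranks[i] = midrank if diffs[i] > 0 else -midrank
        start = end + 1
    return ranks


def s_plus(data, mu0):
    """Sum of the positively signed ranks."""
    return sum(r for r in signed_ranks(data, mu0) if r > 0)


temps = [494.6, 510.8, 487.5, 493.2, 502.6, 485.0, 495.9, 498.2,
         501.6, 497.3, 492.0, 504.3, 499.2, 493.5, 505.8]
print(s_plus(temps, 500))       # 35.0

tied = temps.copy()
tied[7] = 498.4                 # x8 changed from 498.2 to 498.4
print(signed_ranks(tied, 500)[7], signed_ranks(tied, 500)[8])   # -2.5 2.5
```

With the modified eighth observation, the two deviations of magnitude 1.6 share rank 2.5, one with a negative sign and one with a positive sign.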
Paired Observations


When the data consisted of pairs (X₁, Y₁), ..., (Xₙ, Yₙ) and the differences D₁ =
X₁ − Y₁, ..., Dₙ = Xₙ − Yₙ were normally distributed, in Chapter 9 we used a
paired t test to test hypotheses about the expected difference μ_D. If normality is not
assumed, hypotheses about μ_D can be tested by using the Wilcoxon signed-rank test
on the Dᵢ’s, provided that the distribution of the differences is continuous and
symmetric. If Xᵢ and Yᵢ both have continuous distributions that differ only with respect to their
means, then Dᵢ will have a continuous symmetric distribution (it is not necessary for the
X and Y distributions to be symmetric individually). The null hypothesis is H₀: μ_D = Δ₀,
and the test statistic S₊ is the sum of the ranks associated with the positive (Dᵢ − Δ₀)’s.


EXAMPLE 15.2 Intermittent fasting (IF) consists of repetitive bouts of short-term fasting. It is of 
potential interest because it may provide a simple tool to improve insulin sensitivity 
in individuals with insulin resistance (the latter increases the likelihood of type 2 dia- 
betes and heart disease). The article “Intermittent Fasting Does Not Affect Whole- 
Body Glucose, Lipid, or Protein Metabolism” (Amer. J. of Clinical Nutr., 2009: 
1244-1251) reported on a study in which resting energy expenditure (kcal/d) was
determined for a sample of n = 8 subjects both while on an IF regimen and while on 
a standard diet. The authors of the article kindly provided the following data: 


Subject            1        2        3        4        5        6        7        8
IF REE          1753.7   1604.4   1576.5   1279.7   1754.2   1695.5   1700.1   1717.0
Std REE         1755.0   1691.1   1697.1   1477.7   1785.2   1669.7   1901.3   1735.3
Difference        −1.3    −86.7   −120.6   −198.0    −31.0     25.8   −201.2    −18.3
Signed rank         −1       −5       −6       −7       −4        3       −8       −2


The article employed the Wilcoxon signed-rank test on the differences to decide
whether there is any difference between true average REE for the IF diet and that for
the standard diet. The relevant hypotheses are


H₀: μ_D = 0 versus Hₐ: μ_D ≠ 0


The test statistic value is clearly s₊ = 3 (only that signed rank is positive). For a
two-tailed test, the P-value is 2P₀(S₊ ≤ 3). In the case n = 8, there are 2⁸ = 256
possible sets of signed ranks, all of which are equally likely when the null hypothesis is
true. The signed-rank sets that result in test statistic values as small or smaller than
the value 3 that came from the data are as follows (only positive signed ranks are
displayed):


no positive signed ranks (s₊ = 0); 1 (s₊ = 1); 2 (s₊ = 2); 1, 2 (s₊ = 3); 3 (s₊ = 3)


So the P-value is 2(5/256) = .039. The null hypothesis would thus be rejected at
significance level .05 but not at significance level .01. The article reported only that
P-value < .05.


Here is output from the R software package:


Wilcoxon signed rank test

data: y
V = 3, p-value = 0.03906
alternative hypothesis: true location is not equal to 0

Wilcoxon signed rank test with continuity correction

data: y
V = 3, p-value = 0.04232
alternative hypothesis: true location is not equal to 0


This latter P-value of .042, which Minitab also reports, is based on the normal
approximation described in the next subsection along with a continuity correction. ■
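
The exact calculation in Example 15.2 can be checked with a short Python sketch. It is offered only as an illustration: the variable names are hypothetical, and the ranking shortcut assumes (as is true for this data set) that there are no ties or zero differences.

```python
from fractions import Fraction
from itertools import product

if_ree = [1753.7, 1604.4, 1576.5, 1279.7, 1754.2, 1695.5, 1700.1, 1717.0]
std_ree = [1755.0, 1691.1, 1697.1, 1477.7, 1785.2, 1669.7, 1901.3, 1735.3]

# Paired differences (IF minus standard), rounded to the data's one decimal place.
diffs = [round(a - b, 1) for a, b in zip(if_ree, std_ree)]

# Rank the absolute differences (no ties or zeros in this data set).
order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
rank_of = {i: r for r, i in enumerate(order, start=1)}
s_plus = sum(rank_of[i] for i, d in enumerate(diffs) if d > 0)

# Exact two-tailed P-value: enumerate all 2**8 equally likely sign patterns.
n = len(diffs)
max_sum = n * (n + 1) // 2
cutoff = max(s_plus, max_sum - s_plus)
favorable = sum(
    1
    for signs in product((0, 1), repeat=n)
    if sum(r for r, positive in zip(range(1, n + 1), signs) if positive) >= cutoff
)
p_value = min(Fraction(1), 2 * Fraction(favorable, 2 ** n))

print(s_plus)            # 3
print(float(p_value))    # 0.0390625
```

The enumeration counts the sign patterns whose statistic is at least max{3, 36 − 3} = 33, which by symmetry is the same as counting those with s₊ ≤ 3.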


A Large-Sample Approximation 


Appendix Table A.13 provides critical values for level α tests only when n ≤ 20.
For n > 20, it can be shown that S₊ has approximately a normal distribution with

$$\mu_{S_+} = \frac{n(n + 1)}{4} \qquad \sigma^2_{S_+} = \frac{n(n + 1)(2n + 1)}{24}$$

when H₀ is true.

The mean and variance result from noting that when H₀ is true (the symmetric
distribution is centered at μ₀), then the rank i is just as likely to receive a + sign as
it is to receive a − sign. Thus

$$S_+ = W_1 + W_2 + W_3 + \cdots + W_n$$

where

$$W_1 = \begin{cases} 1 & \text{with probability } .5 \\ 0 & \text{with probability } .5 \end{cases} \quad \cdots \quad W_n = \begin{cases} n & \text{with probability } .5 \\ 0 & \text{with probability } .5 \end{cases}$$

(Wᵢ = 0 is equivalent to rank i being associated with a −, so i does not contribute to S₊.)

S₊ is then a sum of random variables, and when H₀ is true, these Wᵢ’s can be
shown to be independent. Application of the rules of expected value and variance
gives the mean and variance of S₊. Because the Wᵢ’s are not identically distributed,
our version of the Central Limit Theorem cannot be applied, but there is a more
general version of the theorem that can be used to justify the normality conclusion.
Putting these results together gives the following large-sample test statistic.

$$Z = \frac{S_+ - n(n + 1)/4}{\sqrt{n(n + 1)(2n + 1)/24}} \tag{15.1}$$

A P-value is computed using Appendix Table A.3 as it was for z tests in Chapters 8
and 9.

EXAMPLE 15.3 A particular type of steel beam has been designed to have a compressive strength
(lb/in²) of at least 50,000. For each beam in a sample of 25 beams, the compressive
strength was determined and is given in Table 15.3. Assuming that actual compressive
strength is distributed symmetrically about the true average value, let’s use the
Wilcoxon test with α = .01 to decide whether the true average compressive strength
is less than the specified value, that is, test H₀: μ = 50,000 versus Hₐ: μ < 50,000
(favoring the claim that average compressive strength is at least 50,000).


Table 15.3 Data for Example 15.3

xᵢ − 50,000   Signed Rank  |  xᵢ − 50,000   Signed Rank  |  xᵢ − 50,000   Signed Rank
    −10           −1       |      −99          −10       |      165          +18
    −27           −2       |      113          +11       |     −178          −19
     36           +3       |     −127          −12       |     −183          −20
    −55           −4       |     −129          −13       |     −192          −21
     73           +5       |      136          +14       |     −199          −22
    −77           −6       |     −150          −15       |     −212          −23
    −81           −7       |     −155          −16       |     −217          −24
     90           +8       |     −159          −17       |     −229          −25
    −95           −9       |


The sum of the positively signed ranks is 3 + 5 + 8 + 11 + 14 + 18 = 59,
n(n + 1)/4 = 162.5, and n(n + 1)(2n + 1)/24 = 1381.25, so

$$z = \frac{59 - 162.5}{\sqrt{1381.25}} = -2.78$$

The P-value for this lower-tailed test is Φ(−2.78) = .0027. Since this is less than
.01, H₀ is rejected in favor of the conclusion that true average compressive strength
is less than 50,000. ■
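
The arithmetic of the large-sample test in Example 15.3 can be reproduced with the Python standard library. This is a minimal sketch with hypothetical variable names; it takes the positively signed ranks as given in the example rather than recomputing them from the raw data.

```python
from math import sqrt
from statistics import NormalDist

n = 25
positive_ranks = [3, 5, 8, 11, 14, 18]   # ranks carrying a + sign in Example 15.3

s_plus = sum(positive_ranks)
mean = n * (n + 1) / 4                    # null mean of S+
variance = n * (n + 1) * (2 * n + 1) / 24 # null variance of S+ (no ties)
z = (s_plus - mean) / sqrt(variance)
p_value = NormalDist().cdf(z)             # lower-tailed test: area to the left of z

print(s_plus, mean, variance)   # 59 162.5 1381.25
print(round(z, 2))              # -2.78
print(round(p_value, 4))        # 0.0027
```

For an upper-tailed test the P-value would instead be 1 − Φ(z), and for a two-tailed test twice the smaller tail area.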


When there are ties in the absolute magnitudes, so that average ranks must be
used, it is still correct to standardize S₊ by subtracting n(n + 1)/4, but the following
corrected formula for variance should be used:

$$\sigma^2_{S_+} = \frac{n(n + 1)(2n + 1)}{24} - \frac{1}{48}\sum (\tau_i - 1)(\tau_i)(\tau_i + 1) \tag{15.2}$$

where τᵢ is the number of ties in the ith set of tied values and the sum is over all sets
of tied values. If, for example, n = 10 and the signed ranks are 1, 2, −4, −4, 4, 6, 7,
8.5, 8.5, and 10, then there are two tied sets with τ₁ = 3 and τ₂ = 2, so the summation
is (2)(3)(4) + (1)(2)(3) = 30 and σ² = 96.25 − 30/48 = 95.62. The denominator
in (15.1) should be replaced by the square root of (15.2), though as this
example shows, the correction is usually insignificant.
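
The tie correction in (15.2) is easy to mechanize. The Python sketch below uses a hypothetical function name and assumes its input is the list of signed ranks after averaging; it groups the ranks by absolute value to find the sizes of the tied sets.

```python
from collections import Counter


def tie_corrected_variance(signed_ranks):
    """Null variance of S+ from (15.2), given the signed (possibly averaged) ranks."""
    n = len(signed_ranks)
    # Size of each group of equal absolute ranks; untied values form groups of size 1.
    tie_sizes = Counter(abs(r) for r in signed_ranks).values()
    correction = sum((t - 1) * t * (t + 1) for t in tie_sizes) / 48
    return n * (n + 1) * (2 * n + 1) / 24 - correction


ranks = [1, 2, -4, -4, 4, 6, 7, 8.5, 8.5, 10]
print(tie_corrected_variance(ranks))   # 95.625
```

Groups of size 1 contribute (0)(1)(2) = 0 to the sum, so the function reduces to the uncorrected variance when there are no ties.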


Efficiency of the Wilcoxon Signed-Rank Test 


When the underlying distribution being sampled is normal, either the t test or the
signed-rank test can be used to test a hypothesis about μ. The t test is the best test
in such a situation because among all level α tests it is the one having minimum β.
Since it is generally agreed that there are many experimental situations in which
normality can be reasonably assumed, as well as some in which it should not be, there
are two questions that must be addressed in an attempt to compare the two tests:


1. When the underlying distribution is normal (the “home ground” of the t test),
how much is lost by using the signed-rank test?


2. When the underlying distribution is not normal, can a significant improvement
be achieved by using the signed-rank test?


If the Wilcoxon test does not suffer much with respect to the t test on the “home
ground” of the latter, and performs significantly better than the t test for a large
number of other distributions, then there will be a strong case for using the Wilcoxon test.

Unfortunately, there are no simple answers to the two questions. Upon reflection,
it is not surprising that the t test can perform poorly when the underlying
distribution has “heavy tails” (i.e., when observed values lying far from μ are
relatively more likely than they are when the distribution is normal). This is because the
behavior of the t test depends on the sample mean, which can be very unstable in the
presence of heavy tails. The difficulty in producing answers to the two questions is
that β for the Wilcoxon test is very difficult to obtain and study for any underlying
distribution, and the same can be said for the t test when the distribution is not
normal. Even if β were easily obtained, any measure of efficiency would clearly depend
on which underlying distribution was postulated. A number of different efficiency
measures have been proposed by statisticians; one that many statisticians regard as
credible is called asymptotic relative efficiency (ARE). The ARE of one test with
respect to another is essentially the limiting ratio of sample sizes necessary to obtain
identical error probabilities for the two tests. Thus if the ARE of one test with respect
to a second equals .5, then when sample sizes are large, twice as large a sample size
will be required of the first test to perform as well as the second test. Although the
ARE does not characterize test performance for small sample sizes, the following
results can be shown to hold:


1. When the underlying distribution is normal, the ARE of the Wilcoxon test with
respect to the t test is approximately .95.


2. For any distribution, the ARE will be at least .86 and for many distributions
will be much greater than 1.


We can summarize these results by saying that, in large-sample problems, the
Wilcoxon test is never very much less efficient than the t test and may be much
more efficient if the underlying distribution is far from normal. Though the issue is
far from resolved in the case of sample sizes obtained in most practical problems,
studies have shown that the Wilcoxon test performs reasonably and is thus a viable
alternative to the t test.


EXERCISES Section 15.1 (1-9) 


1. Give as much information as you can about the P-value
for the Wilcoxon test in each of the following situations.
a. n = 12, upper-tailed test, s₊ = 56
b. n = 12, upper-tailed test, s₊ = 62
c. n = 12, lower-tailed test, s₊ = 20
d. n = 14, two-tailed test, s₊ = 21
e. n = 25, two-tailed test, s₊ = 300


2. Here again is the data on expense ratio (%) for a sample
of 20 large-cap blended mutual funds introduced in 
Exercise 1.53: 


1.03 123 1.10 164 130 1.27 1.25 
78 1.05 .64 94 86 1.05 ahd 
09 0.79 1.61 1.26 93 84 


A normal probability plot shows a distinctly nonlinear pat- 
tern, primarily because of the single outlier on each end 
of the data. But a dotplot and boxplot exhibit a reasonable 
amount of symmetry. Assuming a symmetric population 
distribution, does the data provide compelling evidence for 
concluding that the population mean expense ratio exceeds 
1%? Use the Wilcoxon test at significance level .1. [Note: 
The mean expense ratio for the population of all 825 such 
funds is actually 1.08.] 


3. The accompanying data is a subset of the data reported in
the article “Synovial Fluid pH, Lactate, Oxygen and
Carbon Dioxide Partial Pressure in Various Joint
Diseases” (Arthritis and Rheumatism, 1971: 476-477).
The observations are pH values of synovial fluid (which
lubricates joints and tendons) taken from the knees of
individuals suffering from arthritis. Assuming that true
average pH for nonarthritic individuals is 7.39, test at
level .05 to see whether the data indicates a difference
between average pH values for arthritic and nonarthritic
individuals.


7.002 7.35 7.34 7.17 7.28 7.77 7.09
7.22 7.45 6.95 7.40 7.10 7.32 7.14


4. A random sample of 15 automobile mechanics certified
to work on a certain type of car was selected, and the 
time (in minutes) necessary for each one to diagnose a 
particular problem was determined, resulting in the fol- 
lowing data: 


30.6 30.1 15.6 26.7 27.1 25.4 35.0 30.8 
31.9 353.2 12.5 23.2 8.8 24.9 30.2


Use the Wilcoxon test at significance level .10 to decide 
whether the data suggests that true average diagnostic 
time is less than 30 minutes. 


5. Both a gravimetric and a spectrophotometric method are
under consideration for determining phosphate content of 
a particular material. Twelve samples of the material are 
obtained, each is split in half, and a determination is made 
on each half using one of the two methods, resulting in the 
following data: 


Sample               1     2     3     4     5     6     7     8     9     10    11    12
Gravimetric          54.7  58.5  66.8  46.1  52.3  74.3  92.5  40.2  87.3  74.8  63.2  68.5
Spectrophotometric   55.0  55.7  62.9  45.5  51.1  75.4  89.6  38.4  86.8  72.5  62.3  66.0


Use the Wilcoxon test to decide whether one technique 
gives on average a different value than the other tech- 
nique for this type of material. 


6. Reconsider the situation described in Exercise 39 of Section
9.3, and use the Wilcoxon test to test the appropriate 
hypotheses. 


7. Use the large-sample version of the Wilcoxon test at
significance level .05 on the data of Exercise 37 in 
Section 9.3 to decide whether the true mean difference 
between outdoor and indoor concentrations exceeds .20. 


8. Reconsider the port alcohol content data from Exercise
14.22. A normal probability plot casts some doubt on the
assumption of population normality. However, a dotplot
shows a reasonable amount of symmetry, and the mean,
median, and 5% trimmed mean are 19.257, 19.200, and
19.209, respectively. Use the Wilcoxon test at significance
level .01 to decide whether there is substantial evidence
for concluding that true average content exceeds 18.5.


9. Suppose that observations X₁, X₂, …, Xₙ are made on a
process at times 1, 2, …, n. On the basis of this data,
we wish to test


H₀: the Xᵢ’s constitute an independent and identically
distributed sequence


versus


Hₐ: Xᵢ₊₁ tends to be larger than Xᵢ for i = 1, …, n − 1 (an
increasing trend)


Suppose the Xᵢ’s are ranked from 1 to n. Then when
Hₐ is true, larger ranks tend to occur later in the
sequence, whereas if H₀ is true, large and small ranks
tend to be mixed together. Let Rᵢ be the rank of
Xᵢ and consider the test statistic D = Σᵢ₌₁ⁿ (Rᵢ − i)².
Then small values of D give support to Hₐ (e.g., the
smallest value is 0 for R₁ = 1, R₂ = 2, …, Rₙ = n).
When H₀ is true, any sequence of ranks has probability
1/n!. Use this to determine the P-value in the
case n = 4, d = 2. [Hint: List the 4! rank sequences,
compute d for each one, and then obtain the null
distribution of D. See the Lehmann book (in the
chapter bibliography), p. 290, for more information.]


15.2 The Wilcoxon Rank-Sum Test 


The two-sample t test is based on the assumption that both population distributions
are normal. There are situations, though, in which an investigator would want to use 
a test that is valid even if the underlying distributions are quite nonnormal. We now 
describe such a test, called the Wilcoxon rank-sum test. An alternative name for 
the procedure is the Mann-Whitney test, though the Mann-Whitney test statistic is 
sometimes expressed in a slightly different form from that of the Wilcoxon test. The 
Wilcoxon test procedure is distribution-free because it will have the desired level of 
significance for a very large class of underlying distributions. 


ASSUMPTIONS 


X₁, …, Xₘ and Y₁, …, Yₙ are two independent random samples from continuous
distributions with means μ₁ and μ₂, respectively. The X and Y distributions
have the same shape and spread, the only possible difference between the two
being in the values of μ₁ and μ₂.


The null hypothesis H₀: μ₁ − μ₂ = Δ₀ asserts that the X distribution is shifted by
the amount Δ₀ to the right of the Y distribution.


Development of the Test When m = 3, n = 4


Suppose the relevant hypotheses are H₀: μ₁ − μ₂ = 0 versus Hₐ: μ₁ − μ₂ > 0. If
μ₁ is actually much larger than μ₂, then most of the observed x’s will typically fall
to the right of the observed y’s. However, if H₀ is true, then the observed values
from the two samples should be intermingled. The test statistic assesses how much
intermingling there is in the two samples.

Consider the case m = 3, n = 4. Then if all three observed x’s were to the
right of all four observed y’s, this would provide strong evidence for rejecting H₀
in favor of Hₐ. The test procedure involves pooling the X’s and Y’s into a combined
sample of size m + n = 7 and ranking these observations from smallest to largest,
with the smallest receiving rank 1 and the largest rank 7. If most of the largest ranks
were associated with X observations, we would begin to doubt H₀. The test statistic
that quantifies this reasoning is


W = the sum of the ranks in the combined sample
associated with X observations (15.3)


For the values of m and n under consideration, the smallest possible value of
W is w = 1 + 2 + 3 = 6 (if all three x’s are smaller than all four y’s), and the
largest possible value is w = 5 + 6 + 7 = 18 (if all three x’s are larger than all
four y’s).

As an example, suppose x₁ = −3.10, x₂ = 1.67, x₃ = 2.01, y₁ = 5.27, y₂ =
1.89, y₃ = 3.86, and y₄ = .19. Then the pooled ordered sample and corresponding
ranks are as follows:


Ordered pooled sample:   −3.10    .19    1.67   1.89   2.01   3.86   5.27
Sample:                    x       y      x      y      x      y      y
Rank:                      1       2      3      4      5      6      7


Thus w = 1 + 3 + 5 = 9.

For the alternative under consideration, a larger value of W provides more
evidence against H₀ than does a smaller value. This implies that the test is upper-tailed:
P-value = P₀(W ≥ w), where again P₀ denotes the probability computed
assuming that H₀ is true. So we need the “null distribution” of the test statistic. To
this end, recall that when H₀ is true, all seven observations come from the same
population. This means that under H₀, any possible triple of ranks associated with
the three x’s, such as (1, 4, 5), (3, 5, 6), or (5, 6, 7), has the same probability as
any other possible rank triple. Since there are C(7, 3) = 35 possible rank triples, under
H₀ each rank triple has probability 1/35. From a list of all 35 rank triples and the
w value associated with each, the probability distribution of W can immediately be
determined. For example, there are four rank triples that have w value 11, namely
(1, 3, 7), (1, 4, 6), (2, 3, 6), and (2, 4, 5), so P(W = 11) = 4/35. The computations are
summarized in Table 15.4.


Table 15.4 Probability Distribution of W (m = 3, n = 4) When H₀ Is True


w            6     7     8     9     10    11    12    13    14    15    16    17    18
P₀(W = w)    1/35  1/35  2/35  3/35  4/35  4/35  5/35  4/35  4/35  3/35  2/35  1/35  1/35


The null distribution of Table 15.4 is symmetric about w = (6 + 18)/2 = 12,
which is the middle value in the ordered list of possible W values. This is because
the two rank triples (r, s, t) (with r < s < t) and (8 − t, 8 − s, 8 − r) have values
of w symmetric about 12, so for each triple with w value below 12, there is a triple
with w value above 12 by the same amount.

For the alternative under consideration, the test statistic value w = 16 results
in P-value = P₀(W ≥ 16) = 2/35 + 1/35 + 1/35 = .114. We would then not be
able to reject the null hypothesis at significance level .05. The test is lower-tailed if
the alternative hypothesis is Hₐ: μ₁ − μ₂ < 0. The test statistic value w = 7 gives
P-value = P₀(W ≤ 7) = 1/35 + 1/35 = .057. In the case of the alternative hypothesis
Hₐ: μ₁ − μ₂ ≠ 0, the test is two-tailed. Suppose, for example, that w = 17. Then
17 and 18 are at least as contradictory to H₀ as the obtained value of W, and so also are
7 and 6; these latter two values are as far out in the lower tail of the null distribution
as the former two are in the upper tail. The P-value is then P₀(W ≥ 17 or ≤ 7)
= 1/35 + 1/35 + 1/35 + 1/35 = .114.
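The null distribution for m = 3, n = 4 is small enough to enumerate directly. The sketch below lists all 35 equally likely rank triples and recovers the three P-values just computed. The helper names `upper_tail` and `lower_tail` are ours, and exact fractions are used only to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

m, n = 3, 4
ranks = range(1, m + n + 1)

# Under H0 every set of m ranks for the x's is equally likely.
counts = Counter(sum(triple) for triple in combinations(ranks, m))
total = sum(counts.values())  # number of rank triples

def upper_tail(w):
    """P0(W >= w) as an exact fraction."""
    return Fraction(sum(c for value, c in counts.items() if value >= w), total)

def lower_tail(w):
    """P0(W <= w) as an exact fraction."""
    return Fraction(sum(c for value, c in counts.items() if value <= w), total)

for w in sorted(counts):
    print(w, f"{counts[w]}/{total}")

print(float(upper_tail(16)))                   # upper-tailed test, w = 16
print(float(lower_tail(7)))                    # lower-tailed test, w = 7
print(float(upper_tail(17) + lower_tail(7)))   # two-tailed test, w = 17
```

Brute-force enumeration is only practical for small samples, since the number of rank sets grows as C(m + n, m).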


General Description of the Test 


The null hypothesis H₀: μ₁ − μ₂ = Δ₀ is handled by subtracting Δ₀ from each Xᵢ and
using the (Xᵢ − Δ₀)’s as the Xᵢ’s were previously used. Note that for any positive
integer K, the sum of the first K integers is K(K + 1)/2. This implies that the smallest
possible value of the statistic W is m(m + 1)/2, which occurs when the (Xᵢ − Δ₀)’s are all
to the left of the Y sample. The largest possible value of W occurs when the (Xᵢ − Δ₀)’s
lie entirely to the right of the Y’s; in this case, W = (n + 1) + ⋯ + (m + n) = (sum
of first m + n integers) − (sum of first n integers), which gives m(m + 2n + 1)/2.
As with the special case m = 3, n = 4, the distribution of W is symmetric about
the value that is halfway between the smallest and largest values; this middle value is
m(m + n + 1)/2. Because of this symmetry, P-values for lower-tailed and two-tailed
tests are easily obtained from a tabulation of upper-tailed null distribution probabilities.


Null hypothesis: H₀: μ₁ − μ₂ = Δ₀


Test statistic value: w = Σᵢ₌₁ᵐ rᵢ, where rᵢ = rank of (xᵢ − Δ₀) in the combined
sample of m + n (x − Δ₀)’s and y’s


Alternative Hypothesis       P-Value Determination

Hₐ: μ₁ − μ₂ > Δ₀             P₀(W ≥ w)
Hₐ: μ₁ − μ₂ < Δ₀             P₀(W ≤ w) = P₀(W ≥ m(m + n + 1) − w)
Hₐ: μ₁ − μ₂ ≠ Δ₀             2P₀(W ≥ max{w, m(m + n + 1) − w})


Appendix Table A.14 gives P₀(W ≥ c) = P(W ≥ c when H₀ is true) for values
of c for which this probability is closest to .05, .025, .01, and .005. This allows
conclusions to be reached at significance levels that are at least approximately
.05 and .01.


The table gives information only for m = 3, 4, …, 8 and n = m, m + 1, …, 8
(i.e., 3 ≤ m ≤ n ≤ 8). For values of m and n that exceed 8, a normal approximation
can be used; of course statistical software will provide an exact P-value for
any sample size. To use the table for small m and n, though, the X and Y samples
should be labeled so that m ≤ n.
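For sample sizes beyond those in the table, the exact null distribution can still be computed by counting, for each possible sum, the number of m-subsets of the ranks 1, …, m + n having that sum. The following is a rough sketch of the P-value rules in the box above. The function names, the `alternative` labels, and the cap of the two-tailed P-value at 1 are our assumptions rather than part of the procedure as stated, and ties are not handled.

```python
from fractions import Fraction
from math import comb

def null_counts(m, n):
    """Return {w: number of m-subsets of the ranks 1..m+n whose sum is w}."""
    max_sum = m * (m + 2 * n + 1) // 2
    # ways[k][s] = number of k-subsets of the ranks processed so far with sum s
    ways = [[0] * (max_sum + 1) for _ in range(m + 1)]
    ways[0][0] = 1
    for r in range(1, m + n + 1):
        # Go through k in decreasing order so each rank is used at most once.
        for k in range(min(m, r), 0, -1):
            for s in range(r, max_sum + 1):
                ways[k][s] += ways[k - 1][s - r]
    return {s: c for s, c in enumerate(ways[m]) if c > 0}

def rank_sum_p_value(w, m, n, alternative="two-sided"):
    """Exact P-value for the rank-sum statistic w (x sample of size m)."""
    counts = null_counts(m, n)
    total = comb(m + n, m)

    def upper(c):
        return Fraction(sum(k for s, k in counts.items() if s >= c), total)

    mirror = m * (m + n + 1) - w  # reflection of w about the center of symmetry
    if alternative == "greater":
        return upper(w)
    if alternative == "less":
        return upper(mirror)
    if alternative == "two-sided":
        return min(Fraction(1), 2 * upper(max(w, mirror)))  # cap is our addition
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

print(float(rank_sum_p_value(16, 3, 4, "greater")))
print(float(rank_sum_p_value(7, 3, 4, "less")))
print(float(rank_sum_p_value(17, 3, 4, "two-sided")))
```

For m = 3 and n = 4 these calls reproduce the three P-values obtained from Table 15.4.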


EXAMPLE 15.4 Noroviruses are a leading cause of acute gastroenteritis, and as of December 2011,
no vaccine was available to combat this virus. An experiment involved 15 patients,
8 of whom were randomly assigned to receive a new vaccine; the other 7 individuals
received a placebo. The following data on duration (h) of Norwalk virus illness
resulted:

Vaccine: 1.0 6.2 9.2 13.4 22.1 36.1 40.2 63.5 
Placebo: aa 16.6 22.0 38.2 45.8 81.5 107.0 


Does it appear that true average duration is different for the two treatments?
Since use of Appendix Table A.14 requires m ≤ n, let μ₁ denote true average
duration when a placebo is given and μ₂ represent true average duration when the
vaccine is administered.

We wish to test


H₀: μ₁ − μ₂ = 0 versus Hₐ: μ₁ − μ₂ ≠ 0


using the Wilcoxon rank-sum test at significance level .05. The x (placebo) ranks for the
pooled sample consisting of all 15 observations are 2, 6, 7, 10, 12, 14, and 15, from which
w = 2 + ⋯ + 15 = 66. Since max{w, m(m + n + 1) − w} = max{66, 46} = 66,


P-value = 2P₀(W ≥ 66) > 2P₀(W ≥ 71) = 2(.047) = .094


(P₀(W ≥ 71) comes from Table A.14). The P-value clearly exceeds .05, so at this
significance level the null hypothesis cannot be rejected. There is not enough evidence to
conclude that true average duration is different for the vaccine from what it is for the
placebo. Software yields a P-value of .27, which is in excellent agreement with the value
.28 based on larger sample sizes reported in the article “Norovirus Vaccine against
Experimental Human Norwalk Virus Illness” (New England J. of Medicine, 2011:
2178-2187). A large number of articles in this journal include summaries of data analyses
using the Wilcoxon signed-rank test, rank-sum test, or both.


Theoretically, the assumption of continuity of the two distributions ensures that 
all m + n observed x’s and y’s will have different values. In practice, though, there 
will often be ties in the observed values. As with the Wilcoxon signed-rank test, the 
common practice in dealing with ties is to assign each of the tied observations in a 
particular set of ties the average of the ranks they would receive if they differed very 
slightly from one another. 


A Normal Approximation for W 


When both m and n exceed 8, the distribution of W can be approximated by an
appropriate normal curve, and this approximation can be used in place of Appendix
Table A.14. To obtain the approximation, we need μ_W and σ²_W when H₀ is true. In
this case, the rank Rᵢ of Xᵢ − Δ₀ is equally likely to be any one of the possible values
1, 2, 3, …, m + n (Rᵢ has a discrete uniform distribution on the first m + n positive
integers), so the mean of Rᵢ is (m + n + 1)/2. Since W = ΣRᵢ, this gives


$$\mu_W = \mu_{R_1} + \mu_{R_2} + \cdots + \mu_{R_m} = \frac{m(m + n + 1)}{2} \qquad (15.4)$$


The variance of Rᵢ is also easily computed to be (m + n + 1)(m + n − 1)/12.
However, because the Rᵢ’s are not independent variables, V(W) ≠ mV(Rᵢ). Using
the fact that, for any two distinct integers a and b between 1 and m + n inclusive,
P(Rᵢ = a, Rⱼ = b) = 1/[(m + n)(m + n − 1)] (two integers are being sampled without
replacement), Cov(Rᵢ, Rⱼ) = −(m + n + 1)/12, which yields


$$\sigma_W^2 = \sum_{i=1}^{m} V(R_i) + \sum\sum_{i \neq j} \mathrm{Cov}(R_i, R_j) = \frac{mn(m + n + 1)}{12} \qquad (15.5)$$


A Central Limit Theorem can then be used to conclude that when H₀ is true,
the test statistic


$$Z = \frac{W - m(m + n + 1)/2}{\sqrt{mn(m + n + 1)/12}}$$


has approximately a standard normal distribution. A P-value is computed using
Appendix Table A.3 exactly as in the case of previous z tests.


EXAMPLE 15.5 The article “A Lining for the Thermal Comfort of Trekking Boots: Experimental
and Numerical Studies” (Research J. of Textile and Apparel, 2011: 50-61) discussed
the design and development of linings in footwear. The investigators used the
rank-sum test to decide whether true average moisture accumulation (g) was different
for two types of boots. The sample sizes and value of W were not provided, so
suppose that m = n = 9 and w = 37. Then


μ_W = 9(19)/2 = 85.5,  σ_W = √(9(9)(19)/12) = √128.25 = 11.325


from which z = (37 − 85.5)/11.325 = −4.28 (this is the value of z given in the
cited article). The P-value is 2Φ(−4.28), which is less than 2Φ(−3.49) = .0004. At
significance level .01 or even .001, we reject the null hypothesis and conclude that
true average moisture accumulation is different for the two types of boots.
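The standardization is simple to carry out in software. Here is a minimal sketch using the supposed values m = n = 9 and w = 37 from Example 15.5. The function name `rank_sum_z` is ours, and the normal cdf from Python's standard library stands in for Appendix Table A.3.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_z(w, m, n):
    """Standardize W using the null mean (15.4) and null variance (15.5)."""
    mean_w = m * (m + n + 1) / 2
    sd_w = sqrt(m * n * (m + n + 1) / 12)
    return (w - mean_w) / sd_w

z = rank_sum_z(37, 9, 9)
p_two_sided = 2 * NormalDist().cdf(-abs(z))
print(round(z, 2))       # -4.28
print(p_two_sided)       # far below .0004
```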


If there are ties in the data, the numerator of Z is still appropriate, but the
denominator should be replaced by the square root of the adjusted variance


$$\sigma_W^2 = \frac{mn(m + n + 1)}{12} - \frac{mn}{12(m + n)(m + n - 1)}\sum_i (\tau_i - 1)(\tau_i)(\tau_i + 1) \qquad (15.6)$$


where τᵢ is the number of tied observations in the ith set of ties and the sum is over
all sets of ties. Unless there are a great many ties, there is little difference between
Equations (15.6) and (15.5).
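One way to gain confidence in the adjusted variance is to compare it with a brute-force calculation over all equally likely assignments of pooled midranks to the X sample. The sketch below does this for a small pooled sample containing one tied pair. The seven data values are hypothetical, chosen only for illustration, and the function names are ours.

```python
from collections import Counter
from itertools import combinations

def midranks(values):
    """Rank the values, giving tied values the average of the ranks they span."""
    ordered = sorted(values)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    tie_counts = Counter(ordered)
    return [first_position[v] + (tie_counts[v] - 1) / 2 for v in values]

def adjusted_variance(m, n, tie_sizes):
    """Tie-adjusted null variance of W, following (15.6)."""
    total = m + n
    base = m * n * (total + 1) / 12
    tie_sum = sum((t - 1) * t * (t + 1) for t in tie_sizes)
    return base - m * n * tie_sum / (12 * total * (total - 1))

def brute_force_variance(pooled_ranks, m):
    """Variance of the sum of m pooled ranks over all equally likely choices."""
    sums = [sum(chosen) for chosen in combinations(pooled_ranks, m)]
    mean = sum(sums) / len(sums)
    return sum((s - mean) ** 2 for s in sums) / len(sums)

pooled = [1.2, 3.4, 3.4, 5.0, 6.1, 7.7, 9.3]  # hypothetical data, one tied pair
m, n = 3, 4
tie_sizes = [count for count in Counter(pooled).values() if count > 1]

formula_value = adjusted_variance(m, n, tie_sizes)
enumerated_value = brute_force_variance(midranks(pooled), m)
print(formula_value, enumerated_value)
```

With a single tied pair the adjusted variance is only slightly below the untied value mn(m + n + 1)/12 = 8, consistent with the remark that the adjustment matters little unless ties are numerous.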


Efficiency of the Wilcoxon Rank-Sum Test 


When the distributions being sampled are both normal with σ₁ = σ₂, and therefore
have the same shapes and spreads, either the pooled t test or the Wilcoxon test can be
used (the two-sample t test assumes normality but not equal variances, so assumptions
underlying its use are more restrictive in one sense and less in another than those for
Wilcoxon’s test). In this situation, the pooled t test is best among all possible tests in the
sense of minimizing β for any fixed α. However, an investigator can never be absolutely
certain that underlying assumptions are satisfied. It is therefore relevant to ask (1) how
much is lost by using Wilcoxon’s test rather than the pooled t test when the distributions
are normal with equal variances and (2) how W compares to T in nonnormal situations.

The notion of test efficiency was discussed in the previous section in connection
with the one-sample t test and Wilcoxon signed-rank test. The results for the two-sample
tests are the same as those for the one-sample tests. When normality and equal
variances both hold, the rank-sum test is approximately 95% as efficient as the pooled
t test in large samples. That is, the t test will give the same error probabilities as the
Wilcoxon test using slightly smaller sample sizes. On the other hand, the Wilcoxon test
will always be at least 86% as efficient as the pooled t test and may be much more
efficient if the underlying distributions are very nonnormal, especially with heavy tails. The
comparison of the Wilcoxon test with the two-sample (unpooled) t test is less clear-cut.
The two-sample t test is not known to be the best test in any sense, so it seems safe to
conclude that as long as the population distributions have similar shapes and spreads, the
behavior of the Wilcoxon test should compare quite favorably to the two-sample t test.

Lastly, we note that β calculations for the Wilcoxon test are quite difficult.
This is because the distribution of W when H₀ is false depends not only on μ₁ − μ₂
but also on the shapes of the two distributions. For most underlying distributions,
the nonnull distribution of W is virtually intractable. This is why statisticians have
developed large-sample (asymptotic relative) efficiency as a means of comparing
tests. With the capabilities of modern-day computer software, another approach to
calculation of β is to carry out a simulation experiment.
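A simulation experiment of this kind might look like the following sketch. It estimates the rejection probability of the upper-tailed rank-sum test (so β is one minus the estimate when H₀ is false) for normal populations with unit standard deviation. The sample sizes, shift, number of replications, seed, and the use of the normal approximation for the rejection rule are all illustrative assumptions on our part.

```python
import random
from math import sqrt
from statistics import NormalDist

def rank_sum_statistic(xs, ys):
    """Sum of the pooled-sample ranks of the x's (assumes no ties)."""
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    return sum(rank for rank, (_, label) in enumerate(pooled, start=1) if label == 0)

def estimated_power(shift, m=10, n=10, alpha=0.05, reps=2000, seed=1):
    """Estimate P(reject H0) for Ha: mu1 - mu2 > 0 when mu1 - mu2 = shift."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    mean_w = m * (m + n + 1) / 2
    sd_w = sqrt(m * n * (m + n + 1) / 12)
    rejections = 0
    for _ in range(reps):
        xs = [rng.gauss(shift, 1) for _ in range(m)]
        ys = [rng.gauss(0, 1) for _ in range(n)]
        z = (rank_sum_statistic(xs, ys) - mean_w) / sd_w
        if z >= z_crit:
            rejections += 1
    return rejections / reps

level = estimated_power(0.0)   # should be near alpha when H0 is true
power = estimated_power(1.5)   # beta is estimated by 1 - power
print(level, power, 1 - power)
```

Replacing the two `rng.gauss` calls with draws from a heavy-tailed distribution would allow the same comparison in nonnormal situations.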


EXERCISES Section 15.2 (10-16) 


10. Say as much as you can about the P-value for the rank- 
sum test in each of the following situations. 
a. m=5,n = 6, w = 41, upper-tailed test. 
b. m=5,n = 6, w = 22, lower-tailed test. 
c. m=5,n = 6, w = 45, two-tailed test. 
d. m =n = 12, upper-tailed test, x ranks = 4, 7, 8, 11, 
12, 15, 17, 19, 20, 22, 23, 24. 


11. In an experiment to compare the bond strength of two 
different adhesives, each adhesive was used in five 
bondings of two surfaces, and the force necessary to 
separate the surfaces was determined for each bonding. 
For adhesive 1, the resulting values were 229, 286, 245, 
299, and 250, whereas the adhesive 2 observations were 
213, 179, 163, 247, and 225. Let μᵢ denote the true
average bond strength of adhesive type i. Use the
Wilcoxon rank-sum test at level .05 to test H₀: μ₁ = μ₂
versus Hₐ: μ₁ > μ₂.


12. The article “A Study of Wood Stove Particulate 
Emissions” (J. of the Air Pollution Control Assoc., 1979: 
724-728) reports the following data on burn time (hours) 
for samples of oak and pine. Test at level .05 to see 
whether there is any difference in true average burn time 
for the two types of wood. 


Oak   1.72  .67  1.55  1.56  1.42  1.23  1.77  .48
Pine  .98  1.40  1.33  1.52  .73  1.20


13. The urinary fluoride concentration (parts per million) 
was measured both for a sample of livestock grazing in 
an area previously exposed to fluoride pollution and for 
a similar sample grazing in an unpolluted region: 


Polluted | 21.3 18.7 23.0 17.1 16.8 20.9 19.7 
Unpolluted | 14.2 18.3 17.2 18.4 20.0 


Does the data indicate strongly that the true average fluo- 
ride concentration for livestock grazing in the polluted 
region is larger than for the unpolluted region? Use the
Wilcoxon rank-sum test at level α = .01.


14. The article “Multimodal Versus Unimodal Instruction 
in a Complex Learning Environment” (J. of 
Experimental Educ., 2002: 215-239) described an exper- 
iment carried out to compare students’ mastery of certain 
software learned in two different ways. The first learning 
method (multimodal instruction) involved the use of a 
visual manual. The second technique (unimodal instruc- 
tion) employed a textual manual. Here are exam scores for 
the two groups at the end of the experiment (assignment to
the groups was random):

Method 1: 44.85 46.59 47.60 51.08 52.20
56.87 57.03 57.07 60.35 60.82 
67.30 70.15 70.77 75.21 75.28 
76.60 80.30 81.23 

Method 2: 51.95 56.54 57.40 57.60 61.16 
39.91 42.01 43.58 48.83 49.07 
49.48 49.57 49.63 50.75 64.55 
65.31 68.59 72.40 


Does the data suggest that the true average score depends 
on which learning method is used? 


15. The article “Measuring the Exposure of Infants to
Tobacco Smoke” (New England J. of Medicine, 1984:
1075-1078) reports on a study in which various measurements
were taken both from a random sample of
infants who had been exposed to household smoke and
from a sample of unexposed infants. The accompanying
data consists of observations on urinary concentration of
cotinine, a major metabolite of nicotine (the values constitute
a subset of the original data and were read from a
plot that appeared in the article). Does the data suggest
that true average cotinine level is higher in exposed
infants than in unexposed infants by more than 25? Carry
out a test at significance level .05.


Unexposed 8 11 12 14 20 43 111
Exposed 35 56 83 92 128 150 176 208 


16. Reconsider the situation described in Exercise 81 of 
Chapter 9 and the accompanying Minitab output (the 
Greek letter eta is used to denote a median). 


Mann-Whitney Confidence Interval and Test
good   N = 8   Median = 0.540
poor   N = 8   Median = 2.400
Point estimate for ETA1-ETA2 is -1.155
95.9 Percent CI for ETA1-ETA2 is (-3.160, -0.409)
W = 41.0
Test of ETA1 = ETA2 vs ETA1 < ETA2 is significant at 0.0027


a. Verify that the value of Minitab’s test statistic is 
correct. 

b. Carry out an appropriate test of hypotheses using
a significance level of .01.


15.3 Distribution-Free Confidence Intervals


The method we have used so far to construct a confidence interval (CI) can be described
as follows: Start with a random variable (Z, T, χ², F, or the like) that depends on the
parameter of interest and a probability statement involving the variable, manipulate
the inequalities of the statement to isolate the parameter between random endpoints,
and, finally, substitute computed values for random variables. Another general method
for obtaining CIs takes advantage of the relationship between test procedures and CIs
discussed in Section 8.5. A 100(1 − α)% CI for a parameter θ can be obtained from a
level α test for H₀: θ = θ₀ versus Hₐ: θ ≠ θ₀. This method will be used to derive
intervals associated with the Wilcoxon signed-rank test and the Wilcoxon rank-sum test.


PROPOSITION Suppose we have a level α test procedure for testing H₀: θ = θ₀ versus
Hₐ: θ ≠ θ₀. For fixed sample values, let A denote the set of all values θ₀ for
which H₀ is not rejected. Then A is a 100(1 − α)% CI for θ.


This makes intuitive sense because the CI consists of all values of the parameter
that are plausible at the selected confidence level, and we do not want to reject H₀
in favor of Hₐ if θ₀ is a plausible value.

There are actually pathological examples in which the set A defined in the 
proposition is not an interval of θ values, but instead the complement of an interval
or something even stranger. To be more precise, we should really replace the notion 
of a CI with that of a confidence set. In the cases of interest here, the set A does turn 
out to be an interval. 


The Wilcoxon Signed-Rank Interval 


To test H₀: μ = μ₀ versus Hₐ: μ ≠ μ₀ using the Wilcoxon signed-rank test,
where μ is the mean of a continuous symmetric distribution, the absolute values
|x₁ − μ₀|, …, |xₙ − μ₀| are ordered from smallest to largest, with the smallest
receiving rank 1 and the largest rank n. Each rank is then given the sign of its
associated xᵢ − μ₀, and the test statistic is the sum of the positively signed ranks.
The two-tailed test rejects H₀ if s₊ is either ≥ c or ≤ n(n + 1)/2 − c, where c
is obtained from Appendix Table A.13 once the desired level of significance α is
specified. For fixed x₁, …, xₙ, the 100(1 − α)% signed-rank interval will consist
of all μ₀ for which H₀: μ = μ₀ is not rejected at level α. To identify this interval,
it is convenient to express the test statistic S₊ in another form.


S₊ = the number of pairwise averages (Xᵢ + Xⱼ)/2 with i ≤ j that
are ≥ μ₀


That is, if we average each xⱼ in the list with each xᵢ to its left, including
(xⱼ + xⱼ)/2 (which is just xⱼ), and count the number of these averages that are
≥ μ₀, s₊ results. In moving from left to right in the list of sample values, we
are simply averaging every pair of observations in the sample [again including
(xⱼ + xⱼ)/2] exactly once, so the order in which the observations are listed before
averaging is not important. The equivalence of the two methods for computing
s₊ is not difficult to verify. The number of pairwise averages is C(n, 2) + n (the first
term due to averaging of different observations and the second due to averaging
each xᵢ with itself), which equals n(n + 1)/2. It can be shown that P-value ≤ α
if and only if either too many or too few of these pairwise averages are ≥ μ₀, in
which case H₀ is rejected.
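The equivalence of the two forms of s₊ can be checked numerically. The sketch below computes s₊ both ways for the seven cerebral metabolic rate values of Example 15.6 and a few trial values of μ₀. The function names are ours, the trial values are arbitrary, and the code assumes no ties among the |xᵢ − μ₀| and no xᵢ equal to μ₀.

```python
def signed_rank_statistic(xs, mu0):
    """s+ as the sum of the ranks of |x - mu0| belonging to positive differences."""
    diffs = [x - mu0 for x in xs]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    return sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)

def pairwise_average_count(xs, mu0):
    """s+ as the number of averages (x_i + x_j)/2, i <= j, that are >= mu0."""
    n = len(xs)
    return sum(1 for i in range(n) for j in range(i, n)
               if (xs[i] + xs[j]) / 2 >= mu0)

xs = [4.51, 4.59, 4.90, 4.93, 6.80, 5.08, 5.67]
for mu0 in (4.6, 5.0, 5.5):
    print(mu0, signed_rank_statistic(xs, mu0), pairwise_average_count(xs, mu0))
```

For each trial value the two columns agree, and for μ₀ = 5.0 both equal 15.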


EXAMPLE 15.6 The following observations are values of cerebral metabolic rate for rhesus monkeys:
x₁ = 4.51, x₂ = 4.59, x₃ = 4.90, x₄ = 4.93, x₅ = 6.80, x₆ = 5.08, x₇ = 5.67.
The 28 pairwise averages are, in increasing order,


4.51   4.55   4.59   4.705  4.72   4.745  4.76   4.795  4.835  4.90
4.915  4.93   4.99   5.005  5.08   5.09   5.13   5.285  5.30   5.375
5.655  5.67   5.695  5.85   5.865  5.94   6.235  6.80


The first few and the last few of these are pictured in Figure 15.2. 


[Figure 15.2 Plot of the data for Example 15.6: the smallest and largest pairwise
averages marked on a number line, with the value of s₊ labeled for μ₀ in each gap
between consecutive averages. At level .046, H₀ is accepted for μ₀ in the bracketed
central interval.]


Because S₊ is a discrete rv, α = .05 cannot be obtained exactly. Appendix Table
A.13 shows that the P-value for a two-tailed test is 2(.023) = .046 if either s₊ = 26
or 2. Thus H₀ will not be rejected at significance level .046 if 3 ≤ s₊ ≤ 25. That is,
if the number of pairwise averages ≥ μ₀ is between 3 and 25, inclusive, H₀ is not
rejected. From Figure 15.2 the CI for μ with confidence level 95.4% (approximately
95%) is (4.59, 5.94).


In general, once the pairwise averages are ordered from smallest to largest, the
endpoints of the Wilcoxon interval are two of the “extreme” averages. To express
this precisely, let the smallest pairwise average be denoted by x̄_(1), the next smallest
by x̄_(2), …, and the largest by x̄_(n(n+1)/2).


PROPOSITION If the level α Wilcoxon signed-rank test for H₀: μ = μ₀ versus Hₐ: μ ≠ μ₀ is
to reject H₀ if either s₊ ≥ c or s₊ ≤ n(n + 1)/2 − c, then a 100(1 − α)% CI
for μ is


$$\left(\bar{x}_{(n(n+1)/2 - c + 1)},\ \bar{x}_{(c)}\right) \qquad (15.7)$$


In words, the interval extends from the dth smallest pairwise average to the dth largest
average, where d = n(n + 1)/2 − c + 1. Appendix Table A.15 gives the values
of c that correspond to approximately the usual confidence levels for n = 5, 6, …, 25.


EXAMPLE 15.7 (Example 15.6 continued) For n = 7, the P-value for a two-tailed test is
2(.055) = .11 if s₊ = 24 or s₊ = 4. Therefore the null hypothesis will be rejected at
significance level .11 if s₊ = 0, 1, 2, 3, 4, 24, 25, 26, 27, or 28. Thus an 89.0% interval
(approximately 90%) is obtained by using c = 24. The interval is
(x̄_(28−24+1), x̄_(24)) = (x̄_(5), x̄_(24)) = (4.72, 5.85), which
extends from the fifth smallest to the fifth largest pairwise average.
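The interval is straightforward to compute once c is known. Here is a minimal sketch that reproduces the two intervals obtained for the cerebral metabolic rate data. The function names are ours, and the critical value c is taken as an input because it comes from the appendix tables rather than from the code.

```python
def pairwise_averages(xs):
    """All n(n + 1)/2 averages (x_i + x_j)/2 with i <= j, in increasing order."""
    n = len(xs)
    return sorted((xs[i] + xs[j]) / 2 for i in range(n) for j in range(i, n))

def signed_rank_interval(xs, c):
    """Interval from the d-th smallest to the d-th largest pairwise average."""
    averages = pairwise_averages(xs)
    d = len(averages) - c + 1
    return averages[d - 1], averages[c - 1]

xs = [4.51, 4.59, 4.90, 4.93, 6.80, 5.08, 5.67]

lower_95, upper_95 = signed_rank_interval(xs, c=26)   # confidence level 95.4%
lower_89, upper_89 = signed_rank_interval(xs, c=24)   # confidence level 89.0%
print(round(lower_95, 3), round(upper_95, 3))
print(round(lower_89, 3), round(upper_89, 3))
```

The printed endpoints should agree with (4.59, 5.94) and (4.72, 5.85), up to floating-point rounding.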


The derivation of the interval depended on having a single sample from a continuous
symmetric distribution with mean (median) μ. When the data is paired, the
interval constructed from the differences d₁, d₂, …, dₙ is a CI for the mean (median)
difference μ_D. In this case, the symmetry of X and Y distributions need not be assumed;
as long as the X and Y distributions have the same shape, the X − Y distribution will
be symmetric, so only continuity is required.

For n > 20, the large-sample approximation to the Wilcoxon test based
on standardizing S₊ gives an approximation to c in (15.7). The result [for a
100(1 − α)% interval] is


$$c \approx \frac{n(n + 1)}{4} + z_{\alpha/2}\sqrt{\frac{n(n + 1)(2n + 1)}{24}}$$


The efficiency of the Wilcoxon interval relative to the t interval is roughly the 
same as that for the Wilcoxon test relative to the t test. In particular, for large samples 
when the underlying population is normal, the Wilcoxon interval will tend to be 
slightly wider than the t interval, but if the population is quite nonnormal (symmetric 
but with heavy tails), then the Wilcoxon interval will tend to be much narrower than 
the t interval. 


The Wilcoxon Rank-Sum Interval 


The Wilcoxon rank-sum test for testing $H_0$: $\mu_1 - \mu_2 = \Delta_0$ is carried out by first 
combining the $(X_i - \Delta_0)$'s and $Y_j$'s into one sample of size m + n and ranking them 
from smallest (rank 1) to largest (rank m + n). The test statistic W is then the sum 
of the ranks of the $(X_i - \Delta_0)$'s. For the two-sided alternative, $H_0$ is rejected if w is 
either too small or too large. 

To obtain the associated CI for fixed $x_i$'s and $y_j$'s, we must determine the 
set of all $\Delta_0$ values for which $H_0$ is not rejected. This is easiest to do if the test 
statistic is expressed in a slightly different form. The smallest possible value of W is 
m(m + 1)/2, corresponding to every $(X_i - \Delta_0)$ less than every $Y_j$, and there are mn 
differences of the form $(X_i - \Delta_0) - Y_j$. A bit of manipulation gives 

$$W = [\text{number of } (X_i - Y_j - \Delta_0)\text{'s} \geq 0] + \frac{m(m+1)}{2} = [\text{number of } (X_i - Y_j)\text{'s} \geq \Delta_0] + \frac{m(m+1)}{2} \qquad (15.8)$$


The P-value will be at most $\alpha$, leading to rejection of the null hypothesis, if w is 
relatively small (close to its minimum m(m + 1)/2) or large (close to its maximum 
m(m + 2n + 1)/2). This is equivalent to rejecting $H_0$ if the number of 
$(x_i - y_j)$'s $\geq \Delta_0$ is either too small or too large. 
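
A quick numerical check can make the identity relating W to the count of pairwise differences concrete. The sketch below is only an illustration under stated assumptions: the function names and the two small samples are hypothetical, and the samples are chosen so that no ties occur. It computes W once as a rank sum and once as the number of differences $x_i - y_j$ that are at least $\Delta_0$, plus m(m + 1)/2.

```python
def rank_sum_statistic(xs, ys, delta0=0.0):
    """W = sum of the ranks of the (x_i - delta0)'s in the combined sample.
    Assumes no ties among the combined values."""
    shifted = [x - delta0 for x in xs]
    combined = sorted(shifted + list(ys))
    return sum(combined.index(v) + 1 for v in shifted)


def count_form(xs, ys, delta0=0.0):
    """Number of (x_i - y_j)'s that are >= delta0, plus m(m + 1)/2."""
    m = len(xs)
    count = sum(1 for x in xs for y in ys if x - y >= delta0)
    return count + m * (m + 1) // 2


# Hypothetical samples (m = 3, n = 4), used only to check the identity
xs = [3.1, 5.4, 7.9]
ys = [2.0, 4.2, 6.5, 8.8]
for delta0 in (0.0, 1.0, 1.3):
    print(delta0, rank_sum_statistic(xs, ys, delta0), count_form(xs, ys, delta0))
```

As $\Delta_0$ increases, fewer differences exceed it, so both forms of W decrease together; that monotonicity is what turns the acceptance region into an interval of $\Delta_0$ values.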

Expression (15.8) suggests that we compute $x_i - y_j$ for each i and j and order 
these mn differences from smallest to largest. Then if the null value $\Delta_0$ is neither 
smaller than most of the differences nor larger than most, $H_0$: $\mu_1 - \mu_2 = \Delta_0$ is not 
rejected. Varying $\Delta_0$ now shows that a CI for $\mu_1 - \mu_2$ will have as its lower endpoint 
one of the ordered $(x_i - y_j)$'s, and similarly for the upper endpoint. 

PROPOSITION Let $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ be the observed values in two independent samples 
from continuous distributions that differ only in location (and not in shape). 
With $d_{ij} = x_i - y_j$ and the ordered differences denoted by $d_{ij(1)}, d_{ij(2)}, \ldots, d_{ij(mn)}$, 
the general form of a $100(1-\alpha)\%$ CI for $\mu_1 - \mu_2$ is 

$$(d_{ij(mn-c+1)},\ d_{ij(c)}) \qquad (15.9)$$

where c is the critical constant for the two-tailed level $\alpha$ Wilcoxon rank-sum test. 


Notice that the form of the Wilcoxon rank-sum interval (15.9) is very similar to 
the Wilcoxon signed-rank interval (15.7): (15.7) uses pairwise averages from a single 
sample, whereas (15.9) uses pairwise differences from two samples. Appendix 
Table A.16 gives values of c for selected values of m and n. 


EXAMPLE 15.8 The article “Some Mechanical Properties of Impregnated Bark Board” (Forest 
Products J., 1977: 31-38) reports the following data on maximum crushing strength 
(psi) for a sample of epoxy-impregnated bark board and for a sample of bark board 
impregnated with another polymer: 


Epoxy (x's)   10,860  11,120  11,340  12,130  14,380  13,070 
Other (y's)     4590    4850    6510    5640    6390 


Let’s obtain a 95% CI for the true average difference in crushing strength between 
the epoxy-impregnated board and the other type of board. 

From Appendix Table A.16, since the smaller sample size is 5 and the larger 
sample size is 6, c = 26 for a confidence level of approximately 95%. The $d_{ij}$'s 
appear in Table 15.5. The five smallest $d_{ij}$'s $[d_{ij(1)}, \ldots, d_{ij(5)}]$ are 4350, 4470, 4610, 
4730, and 4830; and the five largest $d_{ij}$'s are (in descending order) 9790, 9530, 8740, 
8480, and 8220. Thus the CI is $(d_{ij(5)}, d_{ij(26)}) = (4830, 8220)$. 


Table 15.5 Differences for the Rank-Sum Interval in Example 15.8 

                                 $y_j$
$d_{ij}$            4590    4850    5640    6390    6510 

          10,860    6270    6010    5220    4470    4350 
          11,120    6530    6270    5480    4730    4610 
$x_i$     11,340    6750    6490    5700    4950    4830 
          12,130    7540    7280    6490    5740    5620 
          13,070    8480    8220    7430    6680    6560 
          14,380    9790    9530    8740    7990    7870 
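
The interval of Example 15.8 can be reproduced with a short sketch. The function name is an assumption introduced for illustration; the data and the critical constant c = 26 are those given in the example. The code forms all mn = 30 differences, sorts them, and reads off the (mn - c + 1)th and cth ordered values.

```python
def rank_sum_interval(xs, ys, c):
    """Return (d_(mn-c+1), d_(c)) from the ordered differences x_i - y_j."""
    diffs = sorted(x - y for x in xs for y in ys)
    mn = len(diffs)
    # Python lists are 0-based, so the k-th smallest value is diffs[k - 1]
    return diffs[mn - c], diffs[c - 1]


epoxy = [10860, 11120, 11340, 12130, 14380, 13070]   # x's from Example 15.8
other = [4590, 4850, 6510, 5640, 6390]               # y's from Example 15.8

print(rank_sum_interval(epoxy, other, c=26))
```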


When m and n are both large, the Wilcoxon test statistic has approximately a 
normal distribution. This can be used to derive a large-sample approximation for the 
value c in interval (15.9). The result is 

$$c \approx \frac{mn}{2} + z_{\alpha/2}\sqrt{\frac{mn(m+n+1)}{12}}$$

As with the signed-rank interval, the rank-sum interval (15.9) is quite efficient 
with respect to the t interval; in large samples, it will tend to be only a bit wider than 
the t interval when the underlying populations are normal and may be considerably 
narrower than the t interval if the underlying populations have heavier tails than do 
normal populations. 
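
The two large-sample approximations for c (the one for the signed-rank interval given earlier and the one for the rank-sum interval) are easy to evaluate directly. The sketch below is illustrative only; the function names are assumptions, and `statistics.NormalDist` from the Python standard library supplies the critical value $z_{\alpha/2}$.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(alpha):
    """Upper alpha/2 critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def signed_rank_c(n, alpha):
    """Approximate c for the signed-rank interval (intended for n > 20)."""
    return n * (n + 1) / 4 + z_critical(alpha) * sqrt(n * (n + 1) * (2 * n + 1) / 24)


def rank_sum_c(m, n, alpha):
    """Approximate c for the rank-sum interval (intended for large m and n)."""
    return m * n / 2 + z_critical(alpha) * sqrt(m * n * (m + n + 1) / 12)


# Sample sizes of Example 15.8 (m = 6, n = 5); these are small, so this is
# only a rough comparison with the tabled constant c = 26.
print(round(rank_sum_c(6, 5, 0.05), 2))
print(round(signed_rank_c(25, 0.05), 2))
```

Because c must be an integer, the approximate value is rounded in practice, and the resulting confidence level is then only approximately the nominal one.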


EXERCISES Section 15.3 (17-22) 


17. The article "The Lead Content and Acidity of Christchurch Precipitation" 
(N. Zeal. J. of Science, 1980: 311-312) reports the accompanying data on lead 
concentration (μg/L) in samples gathered during eight different summer 
rainfalls: 17.0, 21.4, 30.6, 5.0, 12.2, 11.8, 17.3, and 18.8. 
Assuming that the lead-content distribution is symmetric, use 
the Wilcoxon signed-rank interval to obtain a 95% CI for $\mu$. 

18. Compute the 99% signed-rank interval for true average 
pH $\mu$ (assuming symmetry) using the data in Exercise 
15.3. [Hint: Try to compute only those pairwise averages 
having relatively small or large values (rather than all 
105 averages).] 

19. An experiment was carried out to compare the abilities 
of two different solvents to extract creosote impregnated 
in test logs. Each of eight logs was divided into two segments, 
and then one segment was randomly selected for 
application of the first solvent, with the other segment 
receiving the second solvent. 

Log          1     2     3     4     5     6     7     8 
Solvent 1   3.92  3.79  3.70  4.08  3.87  3.95  3.55  3.76 
Solvent 2   4.25  4.20  4.41  3.89  4.39  3.75  4.20  3.90 

Calculate a CI using a confidence level of roughly 95% 
for the difference between the true average amount 
extracted using the first solvent and the true average 
amount extracted using the second solvent. 

20. The following observations are amounts of hydrocarbon 
emissions resulting from road wear of bias-belted tires 
under a 522 kg load inflated at 228 kPa and driven at 
64 km/hr for 6 hours ("Characterization of Tire 
Emissions Using an Indoor Test Facility," Rubber 
Chemistry and Technology, 1978: 7-25): .045, .117, 
.062, and .072. What confidence levels are achievable 
for this sample size using the signed-rank interval? 
Select an appropriate confidence level and compute the 
interval. 

21. Compute the 90% rank-sum CI for $\mu_1 - \mu_2$ using the 
data in Exercise 11. 

22. Compute a 99% CI for $\mu_1 - \mu_2$ using the data in 
Exercise 12. 


15.4 Distribution-Free ANOVA 


The single-factor ANOVA model of Chapter 10 for comparing I population or treatment 
means assumed that for i = 1, 2, ..., I, a random sample of size $J_i$ was drawn 
from a normal population with mean $\mu_i$ and variance $\sigma^2$. This can be written as 

$$X_{ij} = \mu_i + \epsilon_{ij} \qquad j = 1, \ldots, J_i;\ i = 1, \ldots, I \qquad (15.10)$$

where the $\epsilon_{ij}$'s are independent and normally distributed with mean zero and variance 
$\sigma^2$. Although the normality assumption was required for the validity of the F 
test described in Chapter 10, the next procedure for testing equality of the $\mu_i$'s 
requires only that the $\epsilon_{ij}$'s have the same continuous distribution. 


The Kruskal-Wallis Test 


Let $N = \sum J_i$, the total number of observations in the data set, and suppose we 
rank all N observations from 1 (the smallest $X_{ij}$) to N (the largest $X_{ij}$). When 
$H_0$: $\mu_1 = \mu_2 = \cdots = \mu_I$ is true, the N observations all come from the same distribution, 
in which case all possible assignments of the ranks 1, 2, ..., N to the I samples are 
equally likely and we expect ranks to be intermingled in these samples. If, however, $H_0$ 
is false, then some samples will consist mostly of observations having small ranks in 
the combined sample, whereas others will consist mostly of observations having large 
ranks. More specifically, if $R_{ij}$ denotes the rank of $X_{ij}$ among the N observations, and 
$R_{i\cdot}$ and $\bar{R}_{i\cdot}$ denote, respectively, the total and average of the ranks in the ith sample, 
then when $H_0$ is true, 

$$E(R_{ij}) = \frac{N+1}{2} \qquad E(\bar{R}_{i\cdot}) = \frac{1}{J_i}\sum_j E(R_{ij}) = \frac{N+1}{2}$$

The Kruskal-Wallis test statistic is a measure of the extent to which the $\bar{R}_{i\cdot}$'s deviate 
from their common expected value (N + 1)/2. 

TEST STATISTIC 

$$K = \frac{12}{N(N+1)}\sum_{i=1}^{I} J_i\left(\bar{R}_{i\cdot} - \frac{N+1}{2}\right)^2 = \frac{12}{N(N+1)}\sum_{i=1}^{I}\frac{R_{i\cdot}^2}{J_i} - 3(N+1) \qquad (15.11)$$


The second expression for K is the computational formula; it involves the rank totals 
($R_{i\cdot}$'s) rather than the averages and requires only one subtraction. 

Values of K at least as contradictory to $H_0$ as the calculated k are those that 
equal or exceed k. That is, the test is upper-tailed: P-value = $P_0(K \geq k)$. Under $H_0$, 
each possible assignment of the ranks to the I samples is equally likely, so in 
theory all such assignments can be enumerated, the value of K determined for 
each one, and the null distribution obtained by counting the number of times each 
value of K occurs. Clearly, this computation is tedious, so even though there are 
tables of the exact null distribution and critical values for small values of the $J_i$'s, 
we will use the following "large-sample" approximation. 


PROPOSITION When $H_0$ is true and either 

I = 3, $J_i \geq 6$ (i = 1, 2, 3) 

or 

I > 3, $J_i \geq 5$ (i = 1, ..., I) 

then K has approximately a chi-squared distribution with I - 1 df. This implies 
that the approximate P-value is the area under the $\chi^2_{I-1}$ curve to the right of k. 
Appendix Table A.11 gives a tabulation of chi-squared upper-tail curve areas. 


EXAMPLE 15.9 The accompanying observations (Table 15.6) on axial stiffness index resulted from 
a study of metal-plate connected trusses in which five different plate lengths (4 in., 
6 in., 8 in., 10 in., and 12 in.) were used ("Modeling Joints Made with Light-Gauge 
Metal Connector Plates," Forest Products J., 1979: 39-44). 

Table 15.6 Data and Ranks for Example 15.9 

        i = 1 (4"):    309.2  309.7  311.0  316.8  326.5  349.8  409.5 
        i = 2 (6"):    331.0  347.2  348.9  361.0  381.7  402.1  404.5 
        i = 3 (8"):    351.0  357.1  366.2  367.3  382.0  392.4  409.9 
        i = 4 (10"):   346.7  362.6  384.2  410.6  433.1  452.9  461.4 
        i = 5 (12"):   407.4  410.7  419.9  441.2  441.8  465.8  473.4 

                                                  $r_{i\cdot}$   $\bar{r}_{i\cdot}$ 
        i = 1:    1    2    3    4    5   10   24     49     7.00 
        i = 2:    6    8    9   13   17   21   22     96    13.71 
Ranks   i = 3:   11   12   15   16   18   20   25    117    16.71 
        i = 4:    7   14   19   26   29   32   33    160    22.86 
        i = 5:   23   27   28   30   31   34   35    208    29.71 


The computed value of K is 

$$k = \frac{12}{35(36)}\left[\frac{(49)^2}{7} + \frac{(96)^2}{7} + \frac{(117)^2}{7} + \frac{(160)^2}{7} + \frac{(208)^2}{7}\right] - 3(36) = 20.12$$

For the chi-squared distribution with 4 df, the area under the curve to the right 
of 14.86 is .005 and the area to the right of 18.47 is .001 (see Appendix Table A.11). 
Since k = 20.12 exceeds 18.47, the P-value for the test is smaller than .001, and thus 
smaller than .01. Therefore $H_0$ is rejected at significance level .01, and we conclude 
that expected axial stiffness does depend on plate length. ■
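
The Kruskal-Wallis computation for the truss data can be checked with a short sketch. The function names are assumptions made for illustration, the ranking step assumes no tied observations (true for Table 15.6), and the chi-squared tail area uses a closed-form expression that holds for integer degrees of freedom rather than a table lookup.

```python
from math import erfc, exp, gamma, sqrt


def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-squared curve with integer df."""
    y = x / 2
    if df % 2 == 0:
        return exp(-y) * sum(y ** j / gamma(j + 1) for j in range(df // 2))
    tail = sum(y ** (j + 0.5) / gamma(j + 1.5) for j in range(df // 2))
    return erfc(sqrt(y)) + exp(-y) * tail


def kruskal_wallis(samples):
    """Return (k, approximate P-value) using the computational form of K.
    Assumes no ties among the pooled observations."""
    pooled = sorted(v for s in samples for v in s)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n_total = len(pooled)
    total = sum(sum(rank[v] for v in s) ** 2 / len(s) for s in samples)
    k = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return k, chi2_upper_tail(k, len(samples) - 1)


stiffness = [  # Table 15.6, one list per plate length
    [309.2, 309.7, 311.0, 316.8, 326.5, 349.8, 409.5],
    [331.0, 347.2, 348.9, 361.0, 381.7, 402.1, 404.5],
    [351.0, 357.1, 366.2, 367.3, 382.0, 392.4, 409.9],
    [346.7, 362.6, 384.2, 410.6, 433.1, 452.9, 461.4],
    [407.4, 410.7, 419.9, 441.2, 441.8, 465.8, 473.4],
]
k, p_value = kruskal_wallis(stiffness)
print(f"k = {k:.2f}, approximate P-value = {p_value:.5f}")
```

When ties are present, midranks and a tie correction would be needed; that refinement is outside this sketch.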


Friedman's Test for a Randomized Block Experiment 


Suppose $X_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$, where $\alpha_i$ is the ith treatment effect, $\beta_j$ is the jth 
block effect, and the $\epsilon_{ij}$'s are drawn independently from the same continuous (but not 
necessarily normal) distribution. Then to test $H_0$: $\alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$, the null 
hypothesis of no treatment effects, the observations are first ranked separately from 
1 to I within each block, and then the rank average $\bar{r}_{i\cdot}$ is computed for each of the 
I treatments. When $H_0$ is true, the $\bar{r}_{i\cdot}$'s should be close to one another, since within 
each block all I! assignments of ranks to treatments are equally likely. Friedman's 
test statistic measures the discrepancy between the expected value (I + 1)/2 of each 
rank average and the $\bar{r}_{i\cdot}$'s. 


TEST STATISTIC 

$$F_r = \frac{12J}{I(I+1)}\sum_{i=1}^{I}\left(\bar{R}_{i\cdot} - \frac{I+1}{2}\right)^2 = \frac{12}{IJ(I+1)}\sum_{i=1}^{I} R_{i\cdot}^2 - 3J(I+1)$$


The test is again upper-tailed, because any value exceeding the calculated $f_r$ is even 
more contradictory to $H_0$ than is $f_r$ itself. For the cases I = 3, J = 2, ..., 15 and 
I = 4, J = 2, ..., 8, Lehmann's book (see the chapter bibliography) gives the upper-tail 
critical values from which P-value information can be obtained. Alternatively, 
for even moderate values of J, the test statistic $F_r$ has approximately a chi-squared 
distribution with I - 1 df when $H_0$ is true, so the approximate P-value is the area 
under the $\chi^2_{I-1}$ curve to the right of $f_r$. 


EXAMPLE 15.10 The article “Physiological Effects During Hypnotically Requested Emotions” 
(Psychosomatic Med., 1963: 334-343) reports the following data (Table 15.7) on 
skin potential (mV) when the emotions of fear, happiness, depression, and calmness 
were requested from each of eight subjects. 


Table 15.7 Data and Ranks for Example 15.10 

                             Blocks (Subjects) 
$x_{ij}$         1      2      3      4      5      6      7      8 

Fear           23.1   57.6   10.5   23.6   11.9   54.6   21.0   20.3 
Happiness      22.7   53.2    9.7   19.6   13.8   47.1   13.6   23.6 
Depression     22.5   53.7   10.8   21.1   13.7   39.2   13.7   16.3 
Calmness       22.6   53.1    8.3   21.6   13.3   37.0   14.8   14.8 

Ranks            1      2      3      4      5      6      7      8    $r_{i\cdot}$   $r_{i\cdot}^2$ 
Fear             4      4      3      4      1      4      4      3     27     729 
Happiness        3      2      2      1      4      3      1      4     20     400 
Depression       1      3      4      2      3      2      2      2     19     361 
Calmness         2      1      1      3      2      1      3      1     14     196 
                                                                             1686 

Thus 

$$f_r = \frac{12}{4(8)(5)}(1686) - 3(8)(5) = 6.45$$


The $\nu = 3$ column of Appendix Table A.11 shows that P-value $\approx$ .09. Since this 
exceeds .05, $H_0$ cannot be rejected at that significance level. There is no evidence 
that average skin potential depends on which emotion is requested. ■
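
Friedman's statistic for Example 15.10 can be recomputed from the within-block ranks in Table 15.7. This is a minimal sketch: the function name is an assumption, the input is the table of ranks rather than the raw measurements, and the P-value line uses the closed form of the chi-squared upper-tail area that is specific to 3 df.

```python
from math import erfc, exp, pi, sqrt


def friedman_statistic(ranks):
    """ranks[i][j] = rank of treatment i within block j (ranks 1..I per block)."""
    n_treat = len(ranks)        # I
    n_block = len(ranks[0])     # J
    sum_sq = sum(sum(row) ** 2 for row in ranks)   # sum of squared rank totals
    return 12 / (n_treat * n_block * (n_treat + 1)) * sum_sq - 3 * n_block * (n_treat + 1)


skin_potential_ranks = [            # rank rows of Table 15.7
    [4, 4, 3, 4, 1, 4, 4, 3],       # fear
    [3, 2, 2, 1, 4, 3, 1, 4],       # happiness
    [1, 3, 4, 2, 3, 2, 2, 2],       # depression
    [2, 1, 1, 3, 2, 1, 3, 1],       # calmness
]
f_r = friedman_statistic(skin_potential_ranks)

# Chi-squared upper-tail area for 3 df (closed form valid only for df = 3)
p_value = erfc(sqrt(f_r / 2)) + sqrt(2 * f_r / pi) * exp(-f_r / 2)
print(f"f_r = {f_r:.2f}, approximate P-value = {p_value:.3f}")
```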


The book by Myles Hollander et al. (see the chapter bibliography) discusses 
multiple comparisons procedures associated with the Kruskal-Wallis and Friedman 
tests, as well as other aspects of distribution-free ANOVA. 


EXERCISES Section 15.4 (23-27) 


23. The accompanying data refers to concentration of the 
radioactive isotope strontium-90 in milk samples 
obtained from five randomly selected dairies in each of 
four different regions. 

           1     6.4    5.8    6.5    7.7    6.1 
Region     2     7.1    9.9   11.2   10.5    8.8 
           3     5.7    5.9    8.2    6.6    5.1 
           4     9.5   12.1   10.3   12.4   11.7 

Test at level .10 to see whether true average strontium-90 
concentration differs for at least two of the regions. 

24. The article "Production of Gaseous Nitrogen in 
Human Steady-State Conditions" (J. of Applied 
Physiology, 1972: 155-159) reports the following 
observations on the amount of nitrogen expired (in 
liters) under four dietary regimens: (1) fasting, (2) 23% 
protein, (3) 32% protein, and (4) 67% protein. Use the 
Kruskal-Wallis test at level .05 to test equality of the 
corresponding $\mu_i$'s. 

1.  4.079  4.859  3.540  5.047  3.298 
2.  4.368  5.668  3.752  5.848  3.802 
3.  4.169  5.709  4.416  5.666  4.123 
4.  4.928  5.608  4.940  5.291  4.674 

1.  4.679  2.870  4.648  3.847 
2.  4.844  3.578  5.393  4.374 
3.  5.059  4.403  4.496  4.688 
4.  5.038  4.905  5.208  4.806 

25. The accompanying data on cortisol level was reported in 
the article "Cortisol, Cortisone, and 11-Deoxycortisol 
Levels in Human Umbilical and Maternal Plasma in 
Relation to the Onset of Labor" (J. of Obstetric 
Gynaecology of the British Commonwealth, 1974: 
737-745). Experimental subjects were pregnant women 
whose babies were delivered between 38 and 42 weeks 
gestation. Group 1 individuals elected to deliver by 
Caesarean section before labor onset, group 2 delivered by 
emergency Caesarean during induced labor, and group 3 
individuals experienced spontaneous labor. Use the 
Kruskal-Wallis test at level .05 to test for equality of the 
three population means. 

Group 1   262  307  211  323  454  339  304  154  287  356 
Group 2   465  501  455  355  468  362 
Group 3   343  772  207  1048  838  687 

26. In a test to determine whether soil pretreated with small 
amounts of Basic-H makes the soil more permeable to 
water, soil samples were divided into blocks, and each 
block received each of the four treatments under study. 
The treatments were (A) water with .001% Basic-H 
flooded on control soil, (B) water without Basic-H on 
control soil, (C) water with Basic-H flooded on soil 
pretreated with Basic-H, and (D) water without Basic-H 
on soil pretreated with Basic-H. Test at level .01 to see 
whether there are any effects due to the different 
treatments. 

                 Blocks 
       1      2      3      4      5 
A     37.1   31.8   28.0   25.9   25.5 
B     33.2   25.3   20.2   20.3   18.3 
C     58.9   54.2   49.2   47.9   38.2 
D     56.7   49.6   46.4   40.9   39.4 

       6      7      8      9      10 
A     25.3   23.7   24.4   21.7   26.2 
B     19.3   17.3   17.0   16.7   18.3 
C     48.8   47.8   40.2   44.0   46.4 
D     37.1   37.5   39.6   35.1   36.5 


27. In an experiment to study the way in which different anesthetics 
affect plasma epinephrine concentration, ten dogs 
were selected and concentration was measured while they 
were under the influence of the anesthetics isoflurane, halothane, 
and cyclopropane ("Sympathoadrenal and 
Hemodynamic Effects of Isoflurane, Halothane, and 
Cyclopropane in Dogs," Anesthesiology, 1974: 465-470). 
Test at level .05 to see whether there is an anesthetic effect 
on concentration. 

                          Dog 
                 1      2      3      4      5 
Isoflurane      .28    .51   1.00    .39    .29 
Halothane       .30    .39    .63    .38    .21 
Cyclopropane   1.07   1.35    .69    .28   1.24 

                 6      7      8      9      10 
Isoflurane      .36    .32    .69    .17    .33 
Halothane       .88    .39    .51    .32    .42 
Cyclopropane   1.53    .49    .56   1.02    .30 


SUPPLEMENTARY EXERCISES (28-36) 


28. The article "Effects of a Rice-Rich Versus Potato-Rich 
Diet on Glucose, Lipoprotein, and Cholesterol 
Metabolism in Noninsulin-Dependent Diabetics" (Amer. 
J. of Clinical Nutr., 1984: 598-606) gives the accompanying 
data on cholesterol-synthesis rate for eight diabetic 
subjects. Subjects were fed a standardized diet with potato 
or rice as the major carbohydrate source. Participants 
received both diets for specified periods of time, with 
cholesterol-synthesis rate (mmol/day) measured at the end 
of each dietary period. The analysis presented in this article 
used a distribution-free test. Use such a test with 
significance level .05 to determine whether the true mean 
cholesterol-synthesis rate differs significantly for the two 
sources of carbohydrates. 

Cholesterol-Synthesis Rate 

Subject    1      2      3      4      5      6      7      8 
Potato    1.88   2.60   1.38   4.41   1.87   2.89   3.96   2.31 
Rice      1.70   3.84   1.13   4.97    .86   1.93   3.36   2.15 

29. High-pressure sales tactics of door-to-door salespeople can 
be quite offensive. Many people succumb to such tactics, 
sign a purchase agreement, and later regret their actions. In 
the mid-1970s, the Federal Trade Commission implemented 
regulations clarifying and extending the rights of 
purchasers to cancel such agreements. The accompanying 
data is a subset of that given in the article "Evaluating the 
FTC Cooling-Off Rule" (J. of Consumer Affairs, 1977: 
101-106). Individual observations are cancellation rates 
for each of nine salespeople during each of 4 years. Use an 
appropriate test at level .05 to see whether true average 
cancellation rate depends on the year. 

                        Salesperson 
        1     2     3     4     5     6     7     8     9 
1973   2.8   5.9   3.3   4.4   1.7   3.8   6.6   3.1   0.0 
1974   3.6   1.7   5.1   2.2   2.1   4.1   4.7   2.7   1.3 
1975   1.4    .9   1.1   3.2    .8   1.5   2.8   1.4    .5 
1976   2.0   2.2    .9   1.1    .5   1.2   1.4   3.5   1.2 

30. The given data on phosphorus concentration in topsoil for 
four different soil treatments appeared in the article 
"Fertilisers for Lotus and Clover Establishment on a 
Sequence of Acid Soils on the East Otago Uplands" (N. 
Zeal. J. of Exptl. Ag., 1984: 119-129). Use a distribution-free 
procedure to test the null hypothesis of no difference 
in true mean phosphorus concentration (mg/g) for the four 
soil treatments. 

            I      8.1    5.9    7.0    8.0    9.0 
Treatment   II    11.5   10.9   12.1   10.3   11.9 
            III   15.3   17.4   16.4   15.8   16.0 
            IV    23.0   33.0   28.4   24.6   27.7 

31. Refer to the data of Exercise 30 and compute a 95% CI 
for the difference between true average concentrations 
for treatments II and III. 

32. The study reported in "Gait Patterns During Free 
Choice Ladder Ascents" (Human Movement Sci., 
1983: 187-195) was motivated by publicity concerning 
the increased accident rate for individuals climbing 
ladders. A number of different gait patterns were used 
by subjects climbing a portable straight ladder according 
to specified instructions. The ascent times for 
seven subjects who used a lateral gait and six subjects 
who used a four-beat diagonal gait are given. 

Lateral     .86   1.31   1.64   1.51   1.53   1.39   1.09 
Diagonal   1.27   1.82   1.66    .85   1.45   1.24 

a. Carry out a test using $\alpha$ = .05 to see whether the data 
suggests any difference in the true average ascent 
times for the two gaits. 

b. Compute a 95% CI for the difference between the 
true average gait times. 

33. The sign test is a very simple procedure for testing 
hypotheses about a population median assuming only 
that the underlying distribution is continuous. To illustrate, 
consider the following sample of 20 observations 
on component lifetime (hr): 

 1.7    3.3    5.1    6.9   12.6   14.4   16.4 
24.6   26.0   26.5   32.1   37.4   40.1   40.5 
41.5   72.4   80.1   86.4   87.5  100.2 

We wish to test $H_0$: $\tilde{\mu} = 25.0$ versus $H_a$: $\tilde{\mu} > 25.0$. The 
test statistic is Y = the number of observations that 
exceed 25. 

a. Determine the P-value of the test when Y = 15. 
[Hint: Think of a "success" as a lifetime that exceeds 
25.0. Then Y is the number of successes in the sample. 
What kind of a distribution does Y have when 
$\tilde{\mu} = 25.0$?] 

b. For the given data, should $H_0$ be rejected at significance 
level .05? 

[Note: The test statistic is the number of differences 
$X_i - 25$ that have positive signs, hence the name sign 
test.] 

34. Refer to Exercise 33, and consider a confidence interval 
associated with the sign test: the sign interval. 
The relevant hypotheses are now $H_0$: $\tilde{\mu} = \tilde{\mu}_0$ versus 
$H_a$: $\tilde{\mu} \neq \tilde{\mu}_0$. 

a. Suppose we decide to reject $H_0$ if either $Y \geq 15$ or 
$Y \leq 5$. What is the smallest $\alpha$ for which this is equivalent 
to rejecting $H_0$ if P-value $\leq \alpha$? 

b. The confidence interval will consist of all values $\tilde{\mu}_0$ 
for which $H_0$ is not rejected. Determine the CI for the 
given data, and state the confidence level. 

35. Suppose we wish to test 

$H_0$: the X and Y distributions are identical 

versus 

$H_a$: the X distribution is less spread out than the Y 
distribution 

The accompanying figure pictures X and Y distributions 
for which $H_a$ is true. The Wilcoxon rank-sum test is not 
appropriate in this situation because when $H_a$ is true as 
pictured, the Y's will tend to be at the extreme ends of the 
combined sample (resulting in small and large Y ranks), 
so the sum of X ranks will result in a W value that is neither 
large nor small. 

[Figure: X distribution and Y distribution, with "Ranks": 1 3 5 ... 6 4 2] 

Consider modifying the procedure for assigning ranks as 
follows: After the combined sample of m + n observations 
is ordered, the smallest observation is given rank 1, the 
largest observation is given rank 2, the second smallest is 
given rank 3, the second largest is given rank 4, and so on. 
Then if $H_a$ is true as pictured, the X values will tend to be in 
the middle of the sample and thus receive large ranks. Let 
W' denote the sum of the X ranks and consider an upper-tailed 
test based on this test statistic. When $H_0$ is true, every 
possible set of X ranks has the same probability, so W' 
has the same distribution as does W when $H_0$ is true. The 
accompanying data refers to medial muscle thickness for 
arterioles from the lungs of children who died from sudden 
infant death syndrome (x's) and a control group of children 
(y's). Carry out the test of $H_0$ versus $H_a$ at level .05. 

SIDS      4.0   4.4   4.8   4.9 
Control   3.7   4.1   4.3   5.1   5.6 

Consult the Lehmann book (in the chapter bibliography) 
for more information on this test, called the Siegel-Tukey 
test. 

36. The ranking procedure described in Exercise 35 is somewhat 
asymmetric, because the smallest observation 
receives rank 1, whereas the largest receives rank 2, and 
so on. Suppose both the smallest and the largest receive 
rank 1, the second smallest and second largest receive 
rank 2, and so on, and let W'' be the sum of the X ranks. 
The null distribution of W'' is not identical to the null 
distribution of W, so different tables are needed. Consider 
the case m = 3, n = 4. List all 35 possible orderings of 
the three X values among the seven observations (e.g., 1, 
3, 7 or 4, 5, 6), assign ranks in the manner described, 
compute the value of W'' for each possibility, and then 
tabulate the null distribution of W''. What is the P-value if 
w'' = 9? This is the Ansari-Bradley test; for additional 
information, see the book by Hollander and Wolfe in the 
chapter bibliography. 
Quality Control Methods 


INTRODUCTION 


Quality characteristics of manufactured products have received much atten- 
tion from design engineers and production personnel as well as from those 
concerned with financial management. An article of faith over the years was 
that very high quality levels and economic well-being were incompatible goals. 
Recently, however, it has become increasingly apparent that raising quality lev- 
els can lead to decreased costs, a greater degree of consumer satisfaction, and 
thus increased profitability. This has resulted in renewed emphasis on statistical 
techniques for designing quality into products and for identifying quality prob- 
lems at various stages of production and distribution. 

Control charting is now used extensively as a diagnostic technique for 
monitoring production and service processes to identify instability and unusual 
circumstances. After an introduction to basic ideas in Section 16.1, a number 
of different control charts are presented in the next four sections. The basis for 
most of these lies in our previous work concerning probability distributions of 
various statistics such as the sample mean $\bar{X}$ and sample proportion $\hat{p} = X/n$. 

Another commonly encountered situation in industrial settings involves a 
decision by a customer as to whether a batch of items offered by a supplier is 
of acceptable quality. In the last section of the chapter, we briefly survey some 
acceptance sampling methods for deciding, based on sample data, on the 
disposition of a batch. 

Besides control charts and acceptance sampling plans, which were first 
developed in the 1920s and 1930s, statisticians and engineers have recently 
introduced many new statistical methods for identifying types and levels of pro- 
duction inputs that will ensure high-quality output. Japanese investigators, and 
in particular the engineer/statistician G. Taguchi and his disciples, have been 
very influential in this respect, and there is now a large body of material known 
as “Taguchi methods.” The ideas of experimental design, and in particular frac- 
tional factorial experiments, are key ingredients. There is still much controversy 
in the statistical community as to which designs and methods of analysis are 
best suited to the task at hand. The expository article by George Box et al., cited 
in the chapter bibliography, gives an informative critique; the book by Thomas 
Ryan listed there is also a good source of information. 


16.1 General Comments on Control Charts 


A central message throughout this book has been the pervasiveness of naturally 
occurring variation associated with any characteristic or attribute of different indi- 
viduals or objects. In a manufacturing context, no matter how carefully machines 
are calibrated, environmental factors are controlled, materials and other inputs are 
monitored, and workers are trained, diameter will vary from bolt to bolt, some 
plastic sheets will be stronger than others, some circuit boards will be defective 
whereas others are not, and so on. We might think of such natural random variation 
as uncontrollable background noise. 

There are, however, other sources of variation that may have a pernicious 
impact on the quality of items produced by some process. Such variation may be 
attributable to contaminated material, incorrect machine settings, unusual tool wear, 
and the like. These sources of variation have been termed assignable causes in 
the quality control literature. Control charts provide a mechanism for recogniz- 
ing situations where assignable causes may be adversely affecting product quality. 
Once a chart indicates an out-of-control situation, an investigation can be launched 
to identify causes and take corrective action. 

A basic element of control charting is that samples have been selected from the 
process of interest at a sequence of time points. Depending on the aspect of the process 
under investigation, some statistic, such as the sample mean or sample proportion of 
defective items, is chosen. The value of this statistic is then calculated for each sample 
in turn. A traditional control chart then results from plotting these calculated values over 
time, as illustrated in Figure 16.1. 


[Plot: value of the quality statistic versus time, with horizontal lines marking 
UCL = upper control limit, the center line, and LCL = lower control limit] 

Figure 16.1 A prototypical control chart 

Notice that in addition to the plotted points themselves, the chart has a center 
line and two control limits. The basis for the choice of a center line is sometimes 
a target value or design specification, for example, a desired value of the bearing 
diameter. In other cases, the height of the center line is estimated from the data. If the 
points on the chart all lie between the two control limits, the process is deemed to be 
in control. That is, the process is believed to be operating in a stable fashion reflect- 
ing only natural random variation. An out-of-control “signal” occurs whenever a 
plotted point falls outside the limits. This is assumed to be attributable to some 
assignable cause, and a search for such causes commences. The limits are designed 
so that an in-control process generates very few false alarms, whereas a process not 
in control quickly gives rise to a point outside the limits. 
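
The basic signalling rule described above (a point outside the control limits triggers a search for assignable causes) can be sketched as follows. The function name, the control limits, and the plotted values are all hypothetical and serve only to illustrate the rule.

```python
def out_of_control_points(values, lcl, ucl):
    """Return the 1-based sample numbers whose statistic lies outside [lcl, ucl]."""
    return [t for t, v in enumerate(values, start=1) if v < lcl or v > ucl]


# Hypothetical chart: limits and plotted values are made up for illustration
lcl, ucl = 9.0, 11.0
plotted = [10.1, 9.8, 10.4, 11.3, 9.9, 8.7]

signals = out_of_control_points(plotted, lcl, ucl)
print(signals if signals else "process deemed in control")
```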

There is a strong analogy between the logic of control charting and our previous 
work in hypothesis testing. The null hypothesis here is that the process is in control. 
When an in-control process yields a point outside the control limits (an out-of-control 
signal), a type I error has occurred. On the other hand, a type II error results when an 
out-of-control process produces a point inside the control limits. Appropriate choice of 
sample size and control limits will make the associated error probabilities suitably small. 
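
To see how the type I error probability accumulates over a sequence of plotted points, the sketch below computes the chance of at least one false alarm among k points and the expected number of points until the first false alarm. It assumes independently selected samples and a per-point false-alarm probability `alpha`; the value 0.0027 is a hypothetical choice for illustration, and the function names are likewise assumptions.

```python
def prob_false_alarm_within(k, alpha):
    """P(at least one of k in-control points falls outside the limits),
    assuming the k samples are independent."""
    return 1 - (1 - alpha) ** k


def average_run_length(alpha):
    """Expected number of points until the first false alarm (geometric mean 1/alpha)."""
    return 1 / alpha


alpha = 0.0027   # hypothetical per-point type I error probability
for k in (10, 25, 100):
    print(k, round(prob_false_alarm_within(k, alpha), 4))
print(round(average_run_length(alpha), 1))
```

Even a small per-point error probability produces occasional false alarms over a long run, which is why the limits are chosen to make that probability quite small.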

We emphasize that “in control” is not synonymous with “meets design specifi- 
cations or tolerance.” The extent of natural variation may be such that the percentage 
of items not conforming to specification is much higher than can be tolerated. In 
such cases, a major restructuring of the process will be necessary to improve pro- 
cess capability. An in-control process is simply one whose behavior with respect to 
variation is stable over time, showing no indications of unusual extraneous causes. 

Software for control charting is now widely available. The journal Quality 
Progress contains many advertisements for statistical quality control computer 
packages. In addition, SAS and Minitab, among other general-purpose packages, 
have attractive quality control capabilities. 


EXERCISES Section 16.1 (1-5) 


1. A control chart for thickness of rolled-steel sheets is based
on an upper control limit of .0520 in. and a lower limit of
.0475 in. The first ten values of the quality statistic (in this
case $\bar{X}$, the sample mean thickness of $n = 5$ sample sheets)
are .0506, .0493, .0502, .0501, .0512, .0498, .0485, .0500,
.0505, and .0483. Construct the initial part of the quality
control chart, and comment on its appearance.

2. Refer to Exercise 1 and suppose the ten most recent values
of the quality statistic are .0493, .0485, .0490, .0503,
.0492, .0486, .0495, .0494, .0493, and .0488. Construct
the relevant portion of the corresponding control chart,
and comment on its appearance.

3. Suppose a control chart is constructed so that the probability
of a point falling outside the control limits when the
process is actually in control is .002. What is the probability
that ten successive points (based on independently selected
samples) will be within the control limits? What is the
probability that 25 successive points will all lie within the
control limits? What is the smallest number of successive
points plotted for which the probability of observing at least
one outside the control limits exceeds .10?

4. A cork intended for use in a wine bottle is considered
acceptable if its diameter is between 2.9 cm and 3.1 cm
(so the lower specification limit is LSL = 2.9 and the
upper specification limit is USL = 3.1).

a. If cork diameter is a normally distributed variable
with mean value 3.04 cm and standard deviation
.02 cm, what is the probability that a randomly
selected cork will conform to specification?

b. If instead the mean value is 3.00 and the standard
deviation is .05, is the probability of conforming to
specification smaller or larger than it was in (a)?

5. If a process variable is normally distributed, in the long
run virtually all observed values should be between
$\mu - 3\sigma$ and $\mu + 3\sigma$, giving a process spread of $6\sigma$.

a. With LSL and USL denoting the lower and upper
specification limits, one commonly used process
capability index is $C_p = (\text{USL} - \text{LSL})/6\sigma$. The value
$C_p = 1$ indicates a process that is only marginally
capable of meeting specifications. Ideally, $C_p$ should
exceed 1.33 (a “very good” process). Calculate the
value of $C_p$ for each of the cork production processes
described in the previous exercise, and comment.

b. The $C_p$ index described in (a) does not take into
account process location. A capability measure that
does involve the process mean is

$$C_{pk} = \min\{(\text{USL} - \mu)/3\sigma,\ (\mu - \text{LSL})/3\sigma\}$$

Calculate the value of $C_{pk}$ for each of the cork-
production processes described in the previous
exercise, and comment. [Note: In practice, $\mu$ and $\sigma$ have
to be estimated from process data; we show how to
do this in Section 16.2.]

c. How do $C_p$ and $C_{pk}$ compare, and when are they equal?


16.2 Control Charts for Process Location


Suppose the quality characteristic of interest is associated with a variable whose
observed values result from making measurements. For example, the characteristic
might be resistance of electrical wire (ohms), internal diameter of molded rubber
expansion joints (cm), or hardness of a certain alloy (Brinell units). One important
use of control charts is to see whether some measure of location of the variable’s
distribution remains stable over time. The most popular chart for this purpose is
the $\bar{X}$ chart.


The $\bar{X}$ Chart Based on Known Parameter Values


Because there is uncertainty about the value of the variable for any particular item
or specimen, we denote such a random variable (rv) by $X$. Assume that for an
in-control process, $X$ has a normal distribution with mean value $\mu$ and standard
deviation $\sigma$. Then if $\bar{X}$ denotes the sample mean for a random sample of size $n$
selected at a particular time point, we know that

1. $E(\bar{X}) = \mu$
2. $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
3. $\bar{X}$ has a normal distribution.

It follows that

$$P(\mu - 3\sigma_{\bar{X}} \le \bar{X} \le \mu + 3\sigma_{\bar{X}}) = P(-3.00 \le Z \le 3.00) = .9974$$

where $Z$ is a standard normal rv.* It is thus highly likely that for an in-control
process, the sample mean will fall within 3 standard deviations ($3\sigma_{\bar{X}}$) of the
process mean $\mu$.

Consider first the case in which the values of both $\mu$ and $\sigma$ are known.
Suppose that at each of the time points 1, 2, 3,..., a random sample of size $n$ is
available. Let $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots$ denote the calculated values of the corresponding
sample means. An $\bar{X}$ chart results from plotting these $\bar{x}_i$’s over time (that is,
plotting points $(1, \bar{x}_1)$, $(2, \bar{x}_2)$, $(3, \bar{x}_3)$, and so on) and then drawing horizontal
lines across the plot at

$$\text{LCL} = \text{lower control limit} = \mu - 3 \cdot \frac{\sigma}{\sqrt{n}}$$

$$\text{UCL} = \text{upper control limit} = \mu + 3 \cdot \frac{\sigma}{\sqrt{n}}$$

Such a plot is often called a 3-sigma chart. Any point outside the control limits
suggests that the process may have been out of control at that time, so a search for
assignable causes should be initiated.

* The use of charts based on 3 SD limits is traditional, but tradition is certainly not inviolable.


EXAMPLE 16.1 Once each day, three specimens of motor oil are randomly selected from the
production process, and each is analyzed to determine viscosity. The accompanying data
(Table 16.1) is for a 25-day period. Extensive experience with this process suggests
that when the process is in control, viscosity of a specimen is normally distributed
with mean 10.5 and standard deviation .18. Thus $\sigma_{\bar{X}} = \sigma/\sqrt{n} = .18/\sqrt{3} = .104$, so
the 3 SD control limits are

$$\text{LCL} = \mu - 3 \cdot \frac{\sigma}{\sqrt{n}} = 10.5 - 3(.104) = 10.188$$

$$\text{UCL} = \mu + 3 \cdot \frac{\sigma}{\sqrt{n}} = 10.5 + 3(.104) = 10.812$$

Table 16.1 Viscosity Data for Example 16.1


Day    Viscosity Observations     $\bar{x}$      $s$     Range
 1     10.37   10.19   10.36     10.307   .101    .18
 2     10.48   10.24   10.58     10.433   .175    .34
 3     10.77   10.22   10.54     10.510   .276    .55
 4     10.47   10.26   10.31     10.347   .110    .21
 5     10.84   10.75   10.53     10.707   .159    .31
 6     10.48   10.53   10.50     10.503   .025    .05
 7     10.41   10.52   10.46     10.463   .055    .11
 8     10.40   10.38   10.69     10.490   .173    .31
 9     10.33   10.35   10.49     10.390   .087    .16
10     10.73   10.45   10.30     10.493   .218    .43
11     10.41   10.68   10.25     10.447   .217    .43
12     10.00   10.60   10.71     10.437   .382    .71
13     10.37   10.50   10.34     10.403   .085    .16
14     10.47   10.60   10.75     10.607   .140    .28
15     10.46   10.46   10.56     10.493   .058    .10
16     10.44   10.68   10.32     10.480   .183    .36
17     10.65   10.42   10.26     10.443   .196    .39
18     10.73   10.72   10.83     10.760   .061    .11
19     10.39   10.75   10.27     10.470   .250    .48
20     10.59   10.23   10.35     10.390   .183    .36
21     10.47   10.67   10.64     10.593   .108    .20
22     10.40   10.55   10.38     10.443   .093    .17
23     10.24   10.71   10.27     10.407   .263    .47
24     10.37   10.69   10.40     10.487   .177    .32
25     10.46   10.35   10.37     10.393   .059    .11


All points on the control chart shown in Figure 16.2 are between the control limits,
indicating stable behavior of the process mean over this time period (the standard
deviation and range for each sample will be used in the next subsection).


Figure 16.2 $\bar{X}$ chart for the viscosity data of Example 16.1
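
The limit calculation of Example 16.1 can be sketched in a few lines of Python. This is only an illustration under the example's stated assumptions (known mean 10.5, known standard deviation .18, n = 3); the function name `xbar_limits` and the short list of sample means (days 1, 5, and 18 of Table 16.1) are chosen here for illustration.

```python
from math import sqrt

def xbar_limits(mu, sigma, n, width=3.0):
    """Return (LCL, UCL) for an X-bar chart when mu and sigma are known."""
    half = width * sigma / sqrt(n)   # width * (standard deviation of the sample mean)
    return mu - half, mu + half

# Values taken from Example 16.1
lcl, ucl = xbar_limits(mu=10.5, sigma=0.18, n=3)

# Three of the daily sample means from Table 16.1 (days 1, 5, 18)
means = [10.307, 10.707, 10.760]
signals = [m for m in means if not (lcl <= m <= ucl)]

print(round(lcl, 3), round(ucl, 3), signals)
```

An empty `signals` list corresponds to "no point outside the limits" for the means that were checked.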


$\bar{X}$ Charts Based on Estimated Parameters


In practice it frequently happens that values of $\mu$ and $\sigma$ are unknown, so they must
be estimated from sample data prior to determining the control limits. This is
especially true when a process is first subjected to a quality control analysis. Denote
the number of observations in each sample by $n$, and let $k$ represent the number of
samples available. Typical values of $n$ are 3, 4, 5, or 6; it is recommended that $k$ be at
least 20. We assume that the $k$ samples were gathered during a period when the
process was believed to be in control. More will be said about this assumption shortly.

With $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$ denoting the $k$ calculated sample means, the usual estimate
of $\mu$ is simply the average of these means:

$$\hat{\mu} = \bar{\bar{x}} = \frac{1}{k}\sum_{i=1}^{k} \bar{x}_i$$


There are two different commonly used methods for estimating $\sigma$: one based on the
$k$ sample standard deviations and the other on the $k$ sample ranges (recall that the
sample range is the difference between the largest and smallest sample observations).
Prior to the wide availability of good calculators and statistical computer
software, ease of hand calculation was of paramount consideration, so the range
method predominated. However, in the case of a normal population distribution, the
unbiased estimator of $\sigma$ based on $S$ is known to have smaller variance than that based
on the sample range. Statisticians say that the former estimator is more efficient than
the latter. The loss in efficiency for the range-based estimator is slight when $n$ is very
small but becomes important for $n > 4$.

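Recall that the sample standard deviation is not an unbiased estimator for $\sigma$.
When $X_1, \ldots, X_n$ is a random sample from a normal distribution, it can be shown (cf.
Exercise 6.37) that

$$E(S) = a_n \cdot \sigma$$

where

$$a_n = \frac{\sqrt{2}\,\Gamma(n/2)}{\sqrt{n-1}\,\Gamma[(n-1)/2]}$$

and $\Gamma(\cdot)$ denotes the gamma function (see Section 4.4). A tabulation of $a_n$ for
selected $n$ follows:

$n$      3      4      5      6      7      8
$a_n$   .886   .921   .940   .952   .959   .965

Let

$$\bar{S} = \frac{1}{k}\sum_{i=1}^{k} S_i$$

where $S_1, S_2, \ldots, S_k$ are the sample standard deviations for the $k$ samples. Then

$$E(\bar{S}) = \frac{1}{k}E\left(\sum_{i=1}^{k} S_i\right) = \frac{1}{k}\sum_{i=1}^{k} E(S_i) = \frac{1}{k}\sum_{i=1}^{k} a_n\sigma = a_n\sigma$$

Thus

$$E\left(\frac{\bar{S}}{a_n}\right) = \frac{1}{a_n}E(\bar{S}) = \frac{1}{a_n} \cdot a_n\sigma = \sigma$$

so $\hat{\sigma} = \bar{S}/a_n$ is an unbiased estimator of $\sigma$.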
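
As a quick check on the bias constant, the gamma-function expression for $a_n$ can be evaluated directly. The sketch below uses only Python's standard library; the function name `a_n` mirrors the text's symbol, and rounding to three decimals is a presentation choice made here.

```python
from math import gamma, sqrt

def a_n(n):
    """Bias constant a_n with E(S) = a_n * sigma for a normal sample of size n."""
    return sqrt(2.0) * gamma(n / 2) / (sqrt(n - 1) * gamma((n - 1) / 2))

# Evaluate the constant for the sample sizes n = 3, ..., 8
table = {n: round(a_n(n), 3) for n in range(3, 9)}
print(table)
```

Dividing an average sample standard deviation by `a_n(n)` then gives the unbiased estimate of $\sigma$ described above.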


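Control Limits Based on the Sample Standard Deviations

$$\text{LCL} = \bar{\bar{x}} - 3 \cdot \frac{\bar{s}}{a_n\sqrt{n}}$$

$$\text{UCL} = \bar{\bar{x}} + 3 \cdot \frac{\bar{s}}{a_n\sqrt{n}}$$

where

$$\bar{\bar{x}} = \frac{1}{k}\sum_{i=1}^{k}\bar{x}_i \qquad \bar{s} = \frac{1}{k}\sum_{i=1}^{k} s_i$$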


EXAMPLE 16.2 Referring to the viscosity data of Example 16.1, we had $n = 3$ and $k = 25$. The
values of $\bar{x}_i$ and $s_i$ ($i = 1, \ldots, 25$) appear in Table 16.1, from which it follows that
$\bar{\bar{x}} = 261.896/25 = 10.476$ and $\bar{s} = 3.834/25 = .153$. With $a_3 = .886$, we have

$$\text{LCL} = 10.476 - 3 \cdot \frac{.153}{.886\sqrt{3}} = 10.476 - .299 = 10.177$$

$$\text{UCL} = 10.476 + 3 \cdot \frac{.153}{.886\sqrt{3}} = 10.476 + .299 = 10.775$$

These limits differ a bit from previous limits based on $\mu = 10.5$ and $\sigma = .18$ because
now $\hat{\mu} = 10.476$ and $\hat{\sigma} = \bar{s}/a_3 = .173$. Inspection of Table 16.1 shows that every $\bar{x}_i$
is between these new limits, so again no out-of-control situation is evident.
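
The arithmetic of Example 16.2 can be reproduced from the two column totals quoted there. The sketch below is illustrative only: the function names are chosen here, and because it carries full precision instead of rounding intermediate values, its results can differ from the example's in the third decimal place.

```python
from math import gamma, sqrt

def a_n(n):
    """Bias constant a_n with E(S) = a_n * sigma (normal sample of size n)."""
    return sqrt(2.0) * gamma(n / 2) / (sqrt(n - 1) * gamma((n - 1) / 2))

def xbar_limits_from_s(xbarbar, sbar, n):
    """3-sigma X-bar limits with sigma estimated by sbar / a_n."""
    half = 3 * sbar / (a_n(n) * sqrt(n))
    return xbarbar - half, xbarbar + half

# Totals quoted in Example 16.2: sum of sample means and sum of sample SDs, k = 25
k = 25
lcl, ucl = xbar_limits_from_s(xbarbar=261.896 / k, sbar=3.834 / k, n=3)
print(round(lcl, 3), round(ucl, 3))
```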


To obtain an estimate of $\sigma$ based on the sample range, note that if $X_1, \ldots, X_n$
form a random sample from a normal distribution, then

$$R = \text{range}(X_1, \ldots, X_n) = \max(X_1, \ldots, X_n) - \min(X_1, \ldots, X_n)$$

$$= \max(X_1 - \mu, \ldots, X_n - \mu) - \min(X_1 - \mu, \ldots, X_n - \mu)$$

$$= \sigma\left\{\max\left(\frac{X_1 - \mu}{\sigma}, \ldots, \frac{X_n - \mu}{\sigma}\right) - \min\left(\frac{X_1 - \mu}{\sigma}, \ldots, \frac{X_n - \mu}{\sigma}\right)\right\}$$

$$= \sigma\{\max(Z_1, \ldots, Z_n) - \min(Z_1, \ldots, Z_n)\}$$

where $Z_1, \ldots, Z_n$ are independent standard normal rv’s. Thus

$$E(R) = \sigma \cdot E(\text{range of a standard normal sample}) = \sigma \cdot b_n$$

so that $R/b_n$ is an unbiased estimator of $\sigma$.

Now denote the ranges for the $k$ samples in the quality control data set by
$r_1, r_2, \ldots, r_k$. The argument just given implies that the estimate

$$\hat{\sigma} = \frac{\frac{1}{k}\sum_{i=1}^{k} r_i}{b_n} = \frac{\bar{r}}{b_n}$$

comes from an unbiased estimator for $\sigma$. Selected values of $b_n$ appear in the
accompanying table [their computation is based on using statistical theory and numerical
integration to determine $E(\min(Z_1, \ldots, Z_n))$ and $E(\max(Z_1, \ldots, Z_n))$].

$n$      3       4       5       6       7       8
$b_n$   1.693   2.058   2.325   2.536   2.706   2.844
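
One way to see where the $b_n$ values come from is to carry out a numerical integration. The sketch below does not follow the text's route through $E(\min)$ and $E(\max)$; it relies instead on a standard identity, assumed here, that the expected range of $n$ independent standard normal variables equals $\int_{-\infty}^{\infty}[1 - \Phi(z)^n - (1 - \Phi(z))^n]\,dz$. The integration interval, step count, and function name are choices made for illustration.

```python
from statistics import NormalDist

def b_n(n, lo=-8.0, hi=8.0, steps=16000):
    """Approximate E(range of n iid standard normals) by the trapezoid rule."""
    Phi = NormalDist().cdf

    def integrand(z):
        F = Phi(z)
        # 1 - P(all Z_i <= z) - P(all Z_i > z)
        return 1.0 - F ** n - (1.0 - F) ** n

    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    return total * h

print([round(b_n(n), 3) for n in range(3, 9)])
```

The integrand is negligible outside the interval used, so truncating the infinite range has no visible effect at three decimals.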


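Control Limits Based on the Sample Ranges

$$\text{LCL} = \bar{\bar{x}} - 3 \cdot \frac{\bar{r}}{b_n\sqrt{n}}$$

$$\text{UCL} = \bar{\bar{x}} + 3 \cdot \frac{\bar{r}}{b_n\sqrt{n}}$$

where $\bar{r} = \sum_{i=1}^{k} r_i/k$ and $r_1, \ldots, r_k$ are the $k$ individual sample ranges.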


EXAMPLE 16.3 (Example 16.2 continued) Table 16.1 yields $\bar{r} = .292$, so
$\hat{\sigma} = .292/b_3 = .292/1.693 = .172$ and

$$\text{LCL} = 10.476 - 3 \cdot \frac{.292}{1.693\sqrt{3}} = 10.476 - .299 = 10.177$$

$$\text{UCL} = 10.476 + 3 \cdot \frac{.292}{1.693\sqrt{3}} = 10.476 + .299 = 10.775$$

These limits are identical to those based on $\bar{s}$, and again every $\bar{x}_i$ lies between the
limits.


Recomputing Control Limits 


We have assumed that the sample data used for estimating $\mu$ and $\sigma$ was obtained from
an in-control process. Suppose, though, that one of the points on the resulting control
chart falls outside the control limits. Then if an assignable cause for this out-of-control
situation can be found and verified, it is recommended that new control limits be
calculated after deleting the corresponding sample from the data set. Similarly, if more
than one point falls outside the original limits, new limits should be determined after
eliminating any such point for which an assignable cause can be identified and dealt
with. It may even happen that one or more points fall outside the new limits, in which
case the deletion/recomputation process must be repeated.
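
The deletion/recomputation cycle just described is naturally a loop. The following is a minimal sketch, assuming limits based on the sample standard deviations; the function name, the `has_cause` callback (standing in for the engineering judgment that an assignable cause was found and verified), and the data are all hypothetical.

```python
from math import sqrt

def recompute_limits(means, sds, n, a_n, has_cause):
    """Repeat: compute limits, drop out-of-limit samples with a verified cause."""
    keep = list(range(len(means)))
    while True:
        center = sum(means[i] for i in keep) / len(keep)
        sbar = sum(sds[i] for i in keep) / len(keep)
        half = 3 * sbar / (a_n * sqrt(n))
        lcl, ucl = center - half, center + half
        # Only points outside the limits AND with an identified cause are deleted
        drop = [i for i in keep if not (lcl <= means[i] <= ucl) and has_cause(i)]
        if not drop:
            return lcl, ucl, keep
        keep = [i for i in keep if i not in drop]

# Hypothetical data: eight samples of size n = 4; the last mean is unusually high
means = [10.0, 10.1, 9.9, 10.0, 10.05, 9.95, 10.0, 10.6]
sds = [0.12] * 8
lcl, ucl, kept = recompute_limits(means, sds, n=4, a_n=0.921,
                                  has_cause=lambda i: i == 7)
print(round(lcl, 3), round(ucl, 3), kept)
```

A point that falls outside the limits without a verified cause is left in the data set, which matches the recommendation in the text.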


Performance Characteristics of Control Charts 


Generally speaking, a control chart will be effective if it gives very few
out-of-control signals when the process is in control, but shows a point outside the control
limits almost as soon as the process goes out of control. One assessment of a chart’s
effectiveness is based on the notion of “error probabilities.” Suppose the variable
of interest is normally distributed with known $\sigma$ (the same value for an in-control
or out-of-control process). In addition, consider a 3-sigma chart based on the target
value $\mu_0$, with $\mu = \mu_0$ when the process is in control. One error probability is

$$\alpha = P(\text{a single sample gives a point outside the control limits when } \mu = \mu_0)$$

$$= P(\bar{X} > \mu_0 + 3\sigma/\sqrt{n} \text{ or } \bar{X} < \mu_0 - 3\sigma/\sqrt{n} \text{ when } \mu = \mu_0)$$

$$= P\left(\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > 3 \text{ or } \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < -3 \text{ when } \mu = \mu_0\right)$$

The standardized variable $Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ has a standard normal distribution
when $\mu = \mu_0$, so

$$\alpha = P(Z > 3 \text{ or } Z < -3) = \Phi(-3.00) + 1 - \Phi(3.00) = .0026$$

If 3.09 rather than 3 had been used to determine the control limits (this is customary
in Great Britain), then

$$\alpha = P(Z > 3.09 \text{ or } Z < -3.09) = .0020$$

The use of 3-sigma limits makes it highly unlikely that an out-of-control signal will
result from an in-control process.

Now suppose the process goes out of control because $\mu$ has shifted to $\mu_0 + \Delta\sigma$
($\Delta$ might be positive or negative); $\Delta$ is the number of standard deviations by which
$\mu$ has changed. A second error probability is

$$\beta = P(\text{a single sample gives a point inside the control limits when } \mu = \mu_0 + \Delta\sigma)$$

$$= P(\mu_0 - 3\sigma/\sqrt{n} < \bar{X} < \mu_0 + 3\sigma/\sqrt{n} \text{ when } \mu = \mu_0 + \Delta\sigma)$$

We now standardize by first subtracting $\mu_0 + \Delta\sigma$ from each term inside the
parentheses and then dividing by $\sigma/\sqrt{n}$:

$$\beta = P(-3 - \sqrt{n}\Delta < \text{standard normal rv} < 3 - \sqrt{n}\Delta)$$

$$= \Phi(3 - \sqrt{n}\Delta) - \Phi(-3 - \sqrt{n}\Delta)$$

This error probability depends on $\Delta$, which determines the size of the shift, and on
the sample size $n$. In particular, for fixed $\Delta$, $\beta$ will decrease as $n$ increases (the larger
the sample size, the more likely it is that an out-of-control signal will result), and for
fixed $n$, $\beta$ decreases as $|\Delta|$ increases (the larger the magnitude of a shift, the more
likely it is that an out-of-control signal will result). The accompanying table gives $\beta$
for selected values of $\Delta$ when $n = 4$.






$\Delta$                  .25     .50     .75     1.00    1.50    2.00    2.50    3.00
$\beta$ when $n = 4$     .9936   .9772   .9332   .8413   .5000   .1587   .0228   .0013


It is clear that a small shift is quite likely to go undetected in a single sample. 
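
Both error probabilities are easy to evaluate numerically from the formulas above. The sketch below uses the standard library's `statistics.NormalDist` for $\Phi$; the function names `alpha` and `beta` and the `width` parameter (3 for a 3-sigma chart, 3.09 for the alternative mentioned above) are naming choices made here. Because the code does not round $\Phi(3.00)$ to four decimals, it returns about .0027 for $\alpha$ where the hand calculation gives .0026.

```python
from math import sqrt
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal cdf

def alpha(width=3.0):
    """P(point outside mu0 +/- width*sigma/sqrt(n)) for an in-control process."""
    return 2 * Phi(-width)

def beta(delta, n, width=3.0):
    """P(point inside the limits) after the mean shifts by delta*sigma."""
    shift = sqrt(n) * delta
    return Phi(width - shift) - Phi(-width - shift)

print(round(alpha(), 4), round(alpha(3.09), 4))
for d in (0.25, 0.50, 0.75, 1.00, 1.50, 2.00):
    print(d, round(beta(d, n=4), 4))
```

Changing `n` in the loop shows directly how a larger sample size lowers $\beta$ for a fixed shift.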

If 3 is replaced by 3.09 in the control limits, then $\alpha$ decreases from .0026 to
.002, but for any fixed $n$ and $\Delta$, $\beta$ will increase. This is just a manifestation of the
inverse relationship between the two types of error probabilities in hypothesis
testing. For example, changing 3 to 2.5 will increase $\alpha$ and decrease $\beta$.

The error probabilities discussed thus far are computed under the assumption
that the variable of interest is normally distributed. If the distribution is only slightly
nonnormal, the Central Limit Theorem effect implies that $\bar{X}$ will have approximately
a normal distribution even when $n$ is small, in which case the stated error
probabilities will be approximately correct. This is, of course, no longer the case when the
variable’s distribution deviates considerably from normality.

A second performance assessment involves expected or average run length
needed to observe an out-of-control signal. When the process is in control, we
should expect to observe many samples before seeing one whose $\bar{x}$ lies outside the
control limits. On the other hand, if a process goes out of control, the expected
number of samples necessary to detect this should be small.

Let $p$ denote the probability that a single sample yields an $\bar{x}$ value outside the
control limits; that is,

$$p = P(\bar{X} < \mu_0 - 3\sigma/\sqrt{n} \text{ or } \bar{X} > \mu_0 + 3\sigma/\sqrt{n})$$

Consider first an in-control process, so that $\bar{X}_1, \bar{X}_2, \bar{X}_3, \ldots$ are all normally distributed
with mean value $\mu_0$ and standard deviation $\sigma/\sqrt{n}$. Define an rv $Y$ by

$$Y = \text{the first } i \text{ for which } \bar{X}_i \text{ falls outside the control limits}$$

If we think of each sample number as a trial and an out-of-control sample as a
success, then $Y$ is the number of (independent) trials necessary to observe a
success. This $Y$ has a geometric distribution, and we showed in Example 3.18 that
$E(Y) = 1/p$. The acronym ARL (for average run length) is often used in place of
$E(Y)$. Because $p = \alpha$ for an in-control process, we have

$$\text{ARL} = E(Y) = \frac{1}{p} = \frac{1}{\alpha} = \frac{1}{.0026} = 384.62$$

Replacing 3 in the control limits by 3.09 gives ARL = 1/.002 = 500.

Now suppose that, at a particular time point, the process mean shifts to
$\mu = \mu_0 + \Delta\sigma$. If we define $Y$ to be the first $i$ subsequent to the shift for which a
sample generates an out-of-control signal, it is again true that ARL $= E(Y) = 1/p$, but
now $p = 1 - \beta$. The accompanying table gives selected ARLs for a 3-sigma chart
when $n = 4$. These results again show the chart’s effectiveness in detecting large
shifts but also its inability to quickly identify small shifts. When sampling is done
rather infrequently, a great many items are likely to be produced before a small shift
in $\mu$ is detected. The CUSUM procedures discussed in Section 16.5 were developed
to address this deficiency.


$\Delta$                .25      .50     .75     1.00    1.50    2.00    2.50    3.00
ARL when $n = 4$     156.25   43.86   14.97   6.30    2.00    1.19    1.02    1.0013
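
Since the run length is geometric, the ARL is just the reciprocal of the per-sample signal probability, and it can be computed for any shift. The sketch below is illustrative; the function names are chosen here. It works with unrounded normal probabilities, so its results differ slightly from values obtained by inverting probabilities rounded to four decimals (for instance, the in-control ARL comes out near 370, whereas $1/.0026 = 384.62$).

```python
from math import sqrt
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal cdf

def beta(delta, n, width=3.0):
    """P(a single sample mean stays inside the limits) after a shift of delta*sigma."""
    shift = sqrt(n) * delta
    return Phi(width - shift) - Phi(-width - shift)

def arl(delta, n, width=3.0):
    """Average run length 1/p with p = 1 - beta; delta = 0 is the in-control case."""
    return 1.0 / (1.0 - beta(delta, n, width))

print(round(arl(0.0, 4), 1))            # in-control ARL for 3-sigma limits
for d in (0.25, 0.50, 1.00, 2.00, 3.00):
    print(d, round(arl(d, 4), 2))
```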
Supplemental Rules for $\bar{X}$ Charts


The inability of $\bar{X}$ charts with 3-sigma limits to quickly detect small shifts in the
process mean has prompted investigators to develop procedures that provide improved
behavior in this respect. One approach involves introducing additional conditions
that cause an out-of-control signal to be generated. The following conditions were
recommended by Western Electric (then a subsidiary of AT&T). An intervention to
take corrective action is appropriate whenever one of these conditions is satisfied:


1. Two out of three successive points fall outside 2-sigma limits on the same side 
of the center line. 


2. Four out of five successive points fall outside 1-sigma limits on the same side 
of the center line. 


3. Eight successive points fall on the same side of the center line. 


A quality control text should be consulted for a discussion of these and other sup- 
plemental rules. 
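
The three conditions listed above can be checked mechanically once each plotted mean is expressed in units of $\sigma_{\bar{X}}$ from the center line. The following is a minimal sketch, not a complete implementation of any published rule set: the function name and arguments are hypothetical, a signal is reported at the index of the point that completes the pattern, and the basic "one point outside the 3-sigma limits" check is left out.

```python
def supplemental_signals(means, center, sigma_xbar):
    """Return indices at which one of the three supplemental conditions holds."""
    z = [(m - center) / sigma_xbar for m in means]
    hits = set()
    for i in range(len(z)):
        for side in (1, -1):                      # above, then below, the center line
            last3 = [side * v for v in z[max(0, i - 2): i + 1]]
            last5 = [side * v for v in z[max(0, i - 4): i + 1]]
            last8 = [side * v for v in z[max(0, i - 7): i + 1]]
            if sum(v > 2 for v in last3) >= 2:    # condition 1: 2 of 3 beyond 2-sigma
                hits.add(i)
            if sum(v > 1 for v in last5) >= 4:    # condition 2: 4 of 5 beyond 1-sigma
                hits.add(i)
            if len(last8) == 8 and all(v > 0 for v in last8):  # condition 3: 8 on one side
                hits.add(i)
    return sorted(hits)

# Hypothetical standardized values (center 0, sigma_xbar 1)
print(supplemental_signals([0.5, 2.1, 0.3, 2.2, -0.1], 0.0, 1.0))
print(supplemental_signals([0.1] * 8, 0.0, 1.0))
```

In the first call the fourth point completes a "two out of three beyond 2-sigma" pattern; in the second, the eighth point completes a run of eight on the same side.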


Robust Control Charts 


The presence of outliers in the sample data tends to reduce the sensitivity of
control-charting procedures when parameters must be estimated. This is because the control
limits are moved outward from the center line, making the identification of unusual
points more difficult. We do not want the statistic whose values are plotted to be
resistant to outliers, because that would mask any out-of-control signal. For example,
plotting sample medians would be less effective than plotting $\bar{x}_1, \bar{x}_2, \ldots$ as is
done on an $\bar{X}$ chart.

The article “Robust Control Charts” by David M. Rocke (Technometrics,
1989: 173-184) presents a study of procedures for which control limits are based on
statistics resistant to the effects of outliers. Rocke recommends control limits
calculated from the interquartile range (IQR), which is very similar to the fourth spread
introduced in Chapter 1. In particular,

IQR = (2nd largest $x_i$) − (2nd smallest $x_i$)     $n = 4, 5, 6, 7$
IQR = (3rd largest $x_i$) − (3rd smallest $x_i$)     $n = 8, 9, 10, 11$

For a random sample from a normal distribution, $E(\text{IQR}) = k_n\sigma$; the values of $k_n$ are
given in the accompanying table.

$n$      4      5      6       7       8
$k_n$   .596   .990   1.282   1.512   .942

The suggested control limits are

$$\text{LCL} = \bar{\bar{x}} - 3 \cdot \frac{\overline{\text{IQR}}}{k_n\sqrt{n}} \qquad \text{UCL} = \bar{\bar{x}} + 3 \cdot \frac{\overline{\text{IQR}}}{k_n\sqrt{n}}$$

where $\overline{\text{IQR}}$ is the average of the $k$ sample IQRs. The values of $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots$ are
plotted. Simulations reported in the article indicated that the performance of the chart
with these limits is superior to that of the traditional $\bar{X}$ chart.
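
The IQR-based limits can be sketched as follows. This is an illustration under assumptions: the $k_n$ constants are copied from the table above, all samples are taken to have the same size $n$, and the function names and the two small samples are invented for the example.

```python
from math import sqrt

# k_n constants as tabulated in the text for n = 4, ..., 8
K_N = {4: 0.596, 5: 0.990, 6: 1.282, 7: 1.512, 8: 0.942}

def sample_iqr(xs):
    """IQR as defined above: 2nd (or 3rd) largest minus 2nd (or 3rd) smallest."""
    n = len(xs)
    if 4 <= n <= 7:
        j = 1            # second order statistic from each end
    elif 8 <= n <= 11:
        j = 2            # third order statistic from each end
    else:
        raise ValueError("definition given only for 4 <= n <= 11")
    s = sorted(xs)
    return s[-1 - j] - s[j]

def robust_limits(samples):
    """3-sigma limits with sigma estimated by (average IQR) / k_n."""
    n, k = len(samples[0]), len(samples)
    center = sum(sum(s) / n for s in samples) / k
    iqr_bar = sum(sample_iqr(s) for s in samples) / k
    half = 3 * iqr_bar / (K_N[n] * sqrt(n))
    return center - half, center + half

# Hypothetical samples of size n = 5
samples = [[10, 12, 11, 13, 9], [11, 11, 12, 10, 13]]
lcl, ucl = robust_limits(samples)
print(sample_iqr(samples[0]), round(lcl, 3), round(ucl, 3))
```

Because each sample's extreme observations are discarded before the spread is measured, a single wild value in a sample has no effect on these limits.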




EXERCISES Section 16.2 (6-15)


6. In the case of known $\mu$ and $\sigma$, what control limits are
necessary for the probability of a single point being
outside the limits for an in-control process to be .005?

7. Consider a 3-sigma control chart with a center line at $\mu_0$
and based on $n = 5$. Assuming normality, calculate the
probability that a single point will fall outside the control
limits when the actual process mean is

a. $\mu_0 + .5\sigma$
b. $\mu_0 - \sigma$
c. $\mu_0 + 2\sigma$

8. The table below gives data on moisture content for
specimens of a certain type of fabric. Determine control
limits for a chart with center line at height 13.00 based
on $\sigma = .600$, construct the control chart, and comment
on its appearance.

Data for Exercise 8

Sample No.   Moisture-Content Observations     $\bar{x}$     $s$    Range
 1           12.2  12.1  13.3  13.0  13.0     12.72   .536   1.2
 2           12.4  13.3  12.8  12.6  12.9     12.80   .339    .9
 3           12.9  12.7  14.2  12.5  12.9     13.04   .669   1.7
 4           13.2  13.0  13.0  12.6  13.9     13.14   .477   1.3
 5           12.8  12.3  12.2  13.3  12.0     12.52   .526   1.3
 6           13.9  13.4  13.1  12.4  13.2     13.20   .543   1.5
 7           12.2  14.4  12.4  12.4  12.5     12.78   .912   2.2
 8           12.6  12.8  13.5  13.9  13.1     13.18   .526   1.3
 9           14.6  13.4  12.2  13.7  12.5     13.28   .963   2.4
10           12.8  12.3  12.6  13.2  12.8     12.74   .329    .9
11           12.6  13.1  12.7  13.2  12.3     12.78   .370    .9
12           13.5  12.3  12.8  13.1  12.9     12.92   .438   1.2
13           13.4  13.3  12.0  12.9  13.1     12.94   .559   1.4
14           13.5  12.4  13.0  13.6  13.4     13.18   .492   1.2
15           12.3  12.8  13.0  12.8  13.5     12.88   .432   1.2
16           12.6  13.4  12.1  13.2  13.3     12.92   .554   1.3
17           12.1  12.7  13.4  13.0  13.9     13.02   .683   1.8
18           13.0  12.8  13.0  13.3  13.1     13.04   .182    .5
19           12.4  13.2  13.0  14.0  13.1     13.14   .573   1.6
20           12.7  12.4  12.4  13.9  12.8     12.84   .619   1.5
21           12.6  12.8  12.7  13.4  13.0     12.90   .316    .8
22           12.7  13.4  12.1  13.2  13.3     12.94   .541   1.3

9. Refer to the data given in Exercise 8, and construct a
control chart with an estimated center line and limits
based on using the sample standard deviations to estimate
$\sigma$. Is there any evidence that the process is out of control?

10. When installing a bath faucet, it is important to properly
fasten the threaded end of the faucet stem to the
water-supply line. The threaded stem dimensions must meet
product specifications, otherwise malfunction and leakage
may occur. Authors of “Improving the Process
Capability of a Boring Operation by the Application
of Statistical Techniques” (Intl. J. Sci. Engr. Research,
Vol. 3, Issue 5, May 2012) investigated the production
process of a particular bath faucet manufactured in India.
The article reported the threaded stem diameter (target
value being 13 mm) of each faucet in 25 samples of size
4 as shown here:

Subgroup    $x_1$     $x_2$     $x_3$     $x_4$
 1         13.02   12.95   12.92   12.99
 2         13.02   13.10   12.96   12.96
 3         13.04   13.08   13.05   13.10
 4         13.04   12.96   12.96   12.97
 5         12.96   12.97   12.90   13.05
 6         12.90   12.88   13.00   13.05
 7         12.97   12.96   12.96   12.99
 8         13.04   13.02   13.05   12.97
 9         13.05   13.10   12.98   12.96
10         12.96   13.00   12.96   12.99
11         12.90   13.05   12.98   12.88
12         12.96   12.98   12.97   13.02
13         13.00   12.96   12.99   12.90
14         12.88   12.94   13.05   13.00
15         12.96   12.96   13.04   12.98
16         12.99   12.94   13.00   13.05
17         13.05   13.02   12.88   12.96
18         13.08   13.06   13.10   13.05
19         13.02   13.05   13.04   12.97
20         12.96   12.90   12.97   13.05
21         12.98   12.99   12.96   13.00
22         12.97   13.02   12.96   12.99
23         13.04   13.00   12.98   13.10
24         13.02   12.90   13.05   12.97
25         12.93   12.88   12.91   12.90

Calculate control limits based on using the sample ranges
to estimate $\sigma$. Does the process appear to be in control?

11. The accompanying table gives sample means and standard
deviations, each based on $n = 6$ observations of the
refractive index of fiber-optic cable. Construct a control
chart, and comment on its appearance. [Hint:
$\Sigma\bar{x}_i = 2317.07$ and $\Sigma s_i = 30.34$.]

Day    $\bar{x}$      $s$        Day    $\bar{x}$      $s$
 1    95.47   1.30       13    97.02   1.28
 2    97.38    .88       14    95.55   1.14
 3    96.85   1.43       15    96.29   1.37
 4    96.64   1.59       16    96.80   1.40
 5    96.87   1.52       17    96.01   1.58
 6    96.52   1.27       18    95.39    .98
 7    96.08   1.16       19    96.58   1.21
 8    96.48    .79       20    96.43    .75
 9    96.63   1.48       21    97.06   1.34
10    96.50    .80       22    98.34   1.60
11    97.22   1.42       23    96.42   1.22
12    96.55   1.65       24    95.99   1.18

12. Refer to Exercise 11. An assignable cause was found for
the unusually high sample average refractive index on
day 22. Recompute control limits after deleting the data
from this day. What do you conclude?

13. Consider the control chart based on control limits
$\mu_0 \pm 2.81\sigma/\sqrt{n}$.

a. What is the ARL when the process is in control?

b. What is the ARL when $n = 4$ and the process mean
has shifted to $\mu = \mu_0 + \sigma$?

c. How do the values of parts (a) and (b) compare to the
corresponding values for a 3-sigma chart?

14. Three-dimensional (3D) printing is a manufacturing
technology that allows the production of three-dimensional
solid objects through a meticulous layering process
performed by a 3D printer. 3D printing has rapidly become a
time-saving and economical way to create a wide variety
of products such as medical implants, furniture, tools, and
even jewelry. The article “Process Capability Analysis of
Cost Effective Rapid Casting Solution Based on Three
Dimensional Printing” (MIT Intl. J. Mech. Engr., 2012:
31-38) considered the production process of metal castings
by using a 3D printer. Data was collected on 16 batches
(each having two castings), where the outer diameter of each
casting (in mm) was recorded. The target diameter of each
casting was 60 mm. The resulting data is given here:

Batch     $x_1$       $x_2$
 1       59.664   59.675
 2       59.661   59.648
 3       59.679   59.652
 4       59.665   59.654
 5       59.667   59.678
 6       59.673   59.657
 7       59.676   59.661
 8       59.648   59.651
 9       59.681   59.675
10       59.655   59.672
11       59.691   59.676
12       59.682   59.651
13       59.651   59.682
14       59.668   59.685
15       59.691   59.682
16       59.661   59.673

Apply the supplemental rules suggested in the text to the
data. Are there any out-of-control signals?

15. Calculate control limits for the data of Exercise 8 using
the robust procedure presented in this section.


16.3 Control Charts for Process Variation


The control charts discussed in the previous section were designed to control the
location (equivalently, central tendency) of a process, with particular attention to the
mean as a measure of location. It is equally important to ensure that a process is under
control with respect to variation. In fact, most practitioners recommend that control
be established on variation prior to constructing an $\bar{X}$ chart or any other chart for
controlling location. In this section, we consider charts for variation based on the sample
standard deviation $S$ and also charts based on the sample range $R$. The former are
generally preferred because the standard deviation gives a more efficient assessment
of variation than does the range, but $R$ charts were used first and tradition dies hard.


The S Chart 


We again suppose that $k$ independently selected samples are available, each one
consisting of $n$ observations on a normally distributed variable. Denote the sample
standard deviations by $s_1, s_2, \ldots, s_k$ with $\bar{s} = \Sigma s_i/k$. The values $s_1, s_2, s_3, \ldots$ are
plotted in sequence on an $S$ chart. The center line of the chart will be at height $\bar{s}$, and
the 3-sigma limits necessitate determining $3\sigma_S$ (just as 3-sigma limits of an $\bar{X}$ chart
required $3\sigma_{\bar{X}} = 3\sigma/\sqrt{n}$, with $\sigma$ then estimated from the data).

Recall that for any rv $Y$, $V(Y) = E(Y^2) - [E(Y)]^2$, and that a sample variance
$S^2$ is an unbiased estimator of $\sigma^2$, that is, $E(S^2) = \sigma^2$. Thus

$$V(S) = E(S^2) - [E(S)]^2 = \sigma^2 - (a_n\sigma)^2 = \sigma^2(1 - a_n^2)$$

where values of $a_n$ for $n = 3, \ldots, 8$ are tabulated in the previous section. The
standard deviation of $S$ is then

$$\sigma_S = \sqrt{V(S)} = \sigma\sqrt{1 - a_n^2}$$

It is natural to estimate $\sigma$ using $s_1, \ldots, s_k$, as was done in the previous section, namely,
$\hat{\sigma} = \bar{s}/a_n$. Substituting $\hat{\sigma}$ for $\sigma$ in the expression for $\sigma_S$ gives the quantity used to
calculate 3-sigma limits.


The 3-sigma control limits for an $S$ control chart are

$$\text{LCL} = \bar{s} - 3\bar{s}\sqrt{1 - a_n^2}/a_n$$

$$\text{UCL} = \bar{s} + 3\bar{s}\sqrt{1 - a_n^2}/a_n$$

The expression for LCL will be negative if $n \le 5$, in which case it is customary
to use LCL = 0.


EXAMPLE 16.4 Table 16.2 displays observations on stress resistance of plastic sheets (the force,
in psi, necessary to crack a sheet). There are $k = 22$ samples, obtained at equally
spaced time points, and $n = 4$ observations in each sample. It is easily verified that
$\Sigma s_i = 51.10$ and $\bar{s} = 2.32$, so the center of the $S$ chart will be at 2.32 (though because
$n = 4$, LCL = 0 and the center line will not be midway between the control limits).
From the previous section, $a_4 = .921$, from which the UCL is

$$\text{UCL} = 2.32 + 3(2.32)\sqrt{1 - (.921)^2}/.921 = 5.26$$
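
The $S$ chart limits, including the convention of replacing a negative LCL by 0, can be sketched as below. The function name is an assumption made for illustration; the inputs $\bar{s} = 2.32$ and $a_4 = .921$ are the values quoted in Example 16.4.

```python
from math import sqrt

def s_chart_limits(sbar, a_n):
    """3-sigma limits for an S chart; a negative LCL is replaced by 0."""
    half = 3 * sbar * sqrt(1 - a_n ** 2) / a_n
    return max(0.0, sbar - half), sbar + half

# Values quoted in Example 16.4 (n = 4, so a_n = .921)
lcl, ucl = s_chart_limits(sbar=2.32, a_n=0.921)
print(lcl, round(ucl, 2))
```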


Table 16.2 Stress-Resistance Data for Example 16.4


Sample No.   Observations                 SD     Range
 1           29.7   29.0   28.8   30.2    .64    1.4
 2           32.2   29.3   32.2   32.9   1.60    3.6
 3           35.9   29.1   32.1   31.3   2.83    6.8
 4           28.8   27.2   28.5   35.7   3.83    8.5
 5           30.9   32.6   28.3   28.3   2.11    4.3
 6           30.6   34.3   34.8   26.3   3.94    8.5
 7           32.3   27.7   30.9   27.8   2.30    4.6
 8           32.0   27.9   31.0   30.8   1.76    4.1
 9           24.2   27.5   28.5   31.1   2.85    6.9
10           33.7   24.4   34.3   31.0   4.53    9.9
11           35.3   33.2   31.4   28.0   3.09    7.3
12           28.1   34.0   31.0   30.8   2.41    5.9
13           28.7   28.9   25.8   29.4   Pal     3.9
14           29.0   33.0   30.2   30.1   1.71    4.0
15           33.9   32.6   33.6   29.2   2.07    4.4
16           26.9   27.3   32.1   28.5   2.31    5.2
17           30.4   29.6   31.0   33.8   1.83    4.2
18           29.0   28.9   31.8   26.7   2.09    5.1
19           33.8   30.9   31.7   28.2   2.32    5.6
20           29.7   27.9   29.1   30.1    .96    2.2
21           27.9   27.7   30.2   32.9   2.43    5.2
22           30.0   31.4   2164   28.1   1.72    31


The resulting control chart is shown in Figure 16.3. All plotted points are well within
the control limits, suggesting stable process behavior with respect to variation.

Figure 16.3 $S$ chart for stress-resistance data for Example 16.4


The R Chart 


Let $r_1, r_2, \ldots, r_k$ denote the $k$ sample ranges and $\bar{r} = \Sigma r_i/k$. The center line of an $R$
chart will be at height $\bar{r}$. Determination of the control limits requires $\sigma_R$, where $R$
denotes the range (prior to making observations, that is, as a random variable) of a random
sample of size $n$ from a normal distribution with mean value $\mu$ and standard
deviation $\sigma$. Because

$$R = \max(X_1, \ldots, X_n) - \min(X_1, \ldots, X_n)$$

$$= \sigma\{\max(Z_1, \ldots, Z_n) - \min(Z_1, \ldots, Z_n)\}$$

where $Z_i = (X_i - \mu)/\sigma$, and the $Z_i$’s are standard normal rv’s, it follows that

$$\sigma_R = \sigma \cdot \left(\begin{array}{c}\text{standard deviation of the range of a random sample}\\ \text{of size } n \text{ from a standard normal distribution}\end{array}\right) = \sigma \cdot c_n$$

The values of $c_n$ for $n = 3, \ldots, 8$ appear in the accompanying table.

$n$      3      4      5      6      7      8
$c_n$   .888   .880   .864   .848   .833   .820

It is customary to estimate $\sigma$ by $\hat{\sigma} = \bar{r}/b_n$ as discussed in the previous section. This
gives $\hat{\sigma}_R = c_n\bar{r}/b_n$ as the estimated standard deviation of $R$.


The 3-sigma limits for an $R$ chart are

$$\text{LCL} = \bar{r} - 3c_n\bar{r}/b_n$$

$$\text{UCL} = \bar{r} + 3c_n\bar{r}/b_n$$

The expression for LCL will be negative if $n \le 6$, in which case LCL = 0
should be used.


EXAMPLE 16.5 In tissue engineering, cells are seeded onto a scaffold that then guides the growth
of new cells. The article “On the Process Capability of the Solid Free-Form
Fabrication: A Case Study of Scaffold Moulds for Tissue Engineering” (J.
of Engr. in Med., 2008: 377-392) used various quality control methods to study
a method of producing such scaffolds. An unusual feature is that instead of subgroups
being observed over time, each subgroup resulted from a different design
dimension (μm). Table 16.3 contains data from Table 2 of the cited article on the
deviation from target in the perpendicular orientation (these deviations are indeed all
positive; the printed beams exhibit larger dimensions than those designed).


Table 16.3 Deviation-from-Target Data for Example 16.5 


des dim (μm)   observations     mean    range   st dev
 200           12   17    6     11.7     11     5.51
 250            6    9   17     10.7     11     5.69
 300            5    9   15      9.7     10     5.03
 350           19    6   11     12.0     13     6.56
 400            9   14    9     10.7      5     2.89
 450            9   15    8     10.7      7     3.79
 500            8   11   12     10.3      4     2.08
 550            4   14   11      9.7     10     5.13
 600           11   14    7     10.7      7     3.51
 650           13    9    9     10.3      4     2.31
 700           10   14    8     10.7      6     3.06
 750            8    9    4      7.0      5     2.65
 800           14    7    9     10.0      7     3.61
 850            7    9   12      9.3      5     2.52
 900           14    5    8      9.0      9     4.58
 950           10   12   10     10.7      2     1.15
1000            7   11   15     11.0      8     4.00


Table 16.3 yields $\sum r_i = 124$, from which $\bar{r} = 7.29$. Since n = 3, LCL = 0. With
$b_3 = 1.693$ and $c_3 = .888$,

$$UCL = 7.29 + 3(.888)(7.29)/1.693 = 18.76$$


Figure 16.4 shows both an R chart and an X chart from the Minitab software pack- 
age (the cited article also included these charts). All points are within the appropri- 
ate control limits, indicating an in-control process for both location and variation. 

Figure 16.4 Control charts for the deviation-from-target data of Example 16.5
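
The R chart arithmetic of Example 16.5 can be sketched in a few lines of Python. This is only an illustration: the function name `r_chart_limits` is hypothetical, the $c_n$ values are those tabulated in the text, and $b_n$ is passed in by the caller because only $b_3 = 1.693$ is quoted in this section.

```python
# c_n values for n = 3, ..., 8, copied from the table in the text.
C_N = {3: .888, 4: .880, 5: .864, 6: .848, 7: .833, 8: .820}

def r_chart_limits(ranges, n, b_n):
    """Return (LCL, center line, UCL) for a 3-sigma R chart.

    ranges: the k sample ranges; n: subgroup size; b_n: the constant
    used in sigma-hat = r-bar / b_n (supplied by the caller).
    """
    r_bar = sum(ranges) / len(ranges)
    half_width = 3 * C_N[n] * r_bar / b_n
    lcl = max(0.0, r_bar - half_width)  # a negative LCL is replaced by 0
    return lcl, r_bar, r_bar + half_width

# Sample ranges from Table 16.3 (they sum to 124).
ranges_16_5 = [11, 11, 10, 13, 5, 7, 4, 10, 7, 4, 6, 5, 7, 5, 9, 2, 8]
lcl, center, ucl = r_chart_limits(ranges_16_5, n=3, b_n=1.693)
print(round(lcl, 2), round(center, 2), round(ucl, 2))
```

Because the code carries $\bar{r}$ at full precision rather than rounding it to 7.29 first, the upper limit may differ from the hand calculation in the second decimal place.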


Charts Based on Probability Limits 


Consider an $\bar{X}$ chart based on the in-control (target) value $\mu_0$ and known $\sigma$. When
the variable of interest is normally distributed and the process is in control,

$$P(\bar{X}_i > \mu_0 + 3\sigma/\sqrt{n}) = .0013 = P(\bar{X}_i < \mu_0 - 3\sigma/\sqrt{n})$$

That is, the probability that a point on the chart falls above the UCL is .0013, as is
the probability that the point falls below the LCL (using 3.09 in place of 3 gives
.001 for each probability). When control limits are based on estimates of $\mu$ and $\sigma$,
these probabilities will be approximately correct provided that n is not too small and
k is at least 20.

By contrast, it is not the case for a 3-sigma S chart that $P(S_i > UCL) = P(S_i < LCL) = .0013$,
nor is it true for a 3-sigma R chart that $P(R_i > UCL) = P(R_i < LCL) = .0013$.
This is because neither S nor R has a normal distribution even
when the population distribution is normal. Instead, both S and R have skewed
distributions. The best that can be said for 3-sigma S and R charts is that an in-control
process is quite unlikely to yield a point at any particular time that is outside the
control limits. Some authors have advocated the use of control limits for which the
“exceedance probability” for each limit is approximately .001. The book Statistical
Methods for Quality Improvement (see the chapter bibliography) contains more
information on this topic.
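
The two tail probabilities quoted for the $\bar{X}$ chart (.0013 for 3-sigma limits and .001 when 3.09 replaces 3) can be checked with the standard normal upper-tail area. The sketch below uses only Python's standard library; the helper name `upper_tail` is an illustrative choice.

```python
from math import erfc, sqrt

def upper_tail(z):
    """P(Z > z) for a standard normal random variable Z."""
    return 0.5 * erfc(z / sqrt(2))

p_3sigma = upper_tail(3.0)   # exceedance probability for each 3-sigma limit
p_309 = upper_tail(3.09)     # exceedance probability when 3.09 replaces 3
print(round(p_3sigma, 4), round(p_309, 4))
```

These figures apply to normally distributed statistics only; as the text notes, they do not carry over to S or R.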

EXERCISES Section 16.3 (16-20)


16. A manufacturer of dustless chalk instituted a quality control
program to monitor chalk density. The sample standard
deviations of densities for 24 different subgroups, each
consisting of n = 8 chalk specimens, were as follows:

.204  .315  .096  .184  .230  .212  .322  .287
.145  .211  .053  .145  .272  .351  .159  .214
.388  .187  .150  .229  .276  .118  .091  .056

Calculate limits for an S chart, construct the chart, and
check for out-of-control points. If there is an out-of-control
point, delete it and repeat the process.

17. Subgroups of power supply units are selected once each
hour from an assembly line, and the high-voltage output
of each unit is determined.
a. Suppose the sum of the resulting sample ranges for
30 subgroups, each consisting of four units, is 85.2.
Calculate control limits for an R chart.
b. Repeat part (a) if each subgroup consists of eight
units and the sum is 106.2.

18. The following data on the deviation from target in the parallel
orientation is taken from Table 1 of the article cited in
Example 16.5. Sometimes a transformation of the data is
appropriate, either because of nonnormality or because
subgroup variation changes systematically with the subgroup
mean. The authors of the cited article suggested a
square root transformation for this data (the family of Box-Cox
transformations is $y = x^{\lambda}$, so $\lambda = .5$ here; Minitab will
identify the best value of $\lambda$). Transform the data as suggested,
calculate control limits for $\bar{X}$, R, and S charts, and
check for the presence of any out-of-control signals.

des dim   observations
 200      14   31   12
 250      22   13    9
 300      [values illegible]
 350      [values illegible]
 400      15    2   36
 450       6   31   14
 500      13   24    9
 550      21   18   16
 600       6   16   20
 650       8   17   23
 700       3   26   17
 750      17   12   22
 800      41   17    3
 850      [values illegible]
 900      [values illegible]
 950      25    4   17
1000       8   23   15

19. Calculate control limits for an S chart from the refractive
index data of Exercise 11. Does the process appear to be
in control with respect to variability? Why or why not?

20. When $S^2$ is the sample variance of a normal random
sample, $(n - 1)S^2/\sigma^2$ has a chi-squared distribution with
n − 1 df, so

$$P\left(\chi^2_{.999,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{.001,n-1}\right) = .998$$

from which

$$P\left(\frac{\sigma^2\chi^2_{.999,n-1}}{n-1} < S^2 < \frac{\sigma^2\chi^2_{.001,n-1}}{n-1}\right) = .998$$

This suggests that an alternative chart for controlling
process variation involves plotting the sample variances
and using the control limits

$$LCL = \overline{s^2}\,\chi^2_{.999,n-1}/(n - 1)$$
$$UCL = \overline{s^2}\,\chi^2_{.001,n-1}/(n - 1)$$

Construct the corresponding chart for the data of
Exercise 11. [Hint: The lower- and upper-tailed chi-squared
critical values for 5 df are .210 and 20.515, respectively.]


16.4 Control Charts for Attributes 


The term attribute data is used in the quality control literature to describe two
situations:

1. Each item produced is either defective or nondefective (conforms to specifications
or does not).

2. A single item may have one or more defects, and the number of defects is
determined.

In the former case, a control chart is based on the binomial distribution; in the latter
case, the Poisson distribution is the basis for a chart.


The p Chart for Fraction Defective


Suppose that when a process is in control, the probability that any particular item
is defective is p (equivalently, p is the long-run proportion of defective items for an
in-control process) and that different items are independent of one another with
respect to their conditions. Consider a sample of n items obtained at a particular
time, and let X be the number of defectives and $\hat{p} = X/n$. Because X has a binomial
distribution, E(X) = np and V(X) = np(1 − p), so

$$E(\hat{p}) = p \qquad V(\hat{p}) = \frac{p(1 - p)}{n}$$

Also, if $np \ge 10$ and $n(1 - p) \ge 10$, $\hat{p}$ has approximately a normal distribution.

In the case of known p (or a chart based on target value), the control limits are

$$LCL = p - 3\sqrt{\frac{p(1 - p)}{n}} \qquad UCL = p + 3\sqrt{\frac{p(1 - p)}{n}}$$

If each sample consists of n items, the number of defective items in the ith sample is
$x_i$, and $\hat{p}_i = x_i/n$, then $\hat{p}_1, \hat{p}_2, \hat{p}_3, \ldots$ are plotted on the control chart.

Usually the value of p must be estimated from the data. Suppose that k samples
from what is believed to be an in-control process are available, and let

$$\bar{p} = \frac{\sum_{i=1}^{k} \hat{p}_i}{k}$$

The estimate $\bar{p}$ is then used in place of p in the aforementioned control limits.


The p chart for the fraction of defective items has its center line at height $\bar{p}$
and control limits

$$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$
$$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$

If LCL is negative, it is replaced by 0.
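
The boxed limits translate directly into code. The following is a minimal sketch, with a hypothetical function name `p_chart_limits`; the inputs $\bar{p} = 1.52/25$ and n = 100 are the values that arise in Example 16.6.

```python
from math import sqrt

def p_chart_limits(p_bar, n):
    """Return (LCL, center line, UCL) for a 3-sigma p chart."""
    half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - half_width)  # a negative LCL is replaced by 0
    return lcl, p_bar, p_bar + half_width

lcl, center, ucl = p_chart_limits(p_bar=1.52 / 25, n=100)
print(lcl, round(center, 4), round(ucl, 4))
```

A sample proportion $\hat{p}_i$ would then be flagged whenever it falls below `lcl` or above `ucl`.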


EXAMPLE 16.6 A sample of 100 cups from a particular dinnerware pattern was selected on each of
25 successive days, and each was examined for defects. The resulting numbers of
unacceptable cups and corresponding sample proportions are as follows:


Day (i)        1     2     3     4     5     6     7     8     9    10    11    12    13
$x_i$          7     4     3     6     4     9     6     7     5     3     7     8     4
$\hat{p}_i$   .07   .04   .03   .06   .04   .09   .06   .07   .05   .03   .07   .08   .04

Assuming that the process was in control during this period, let’s establish control
limits and construct a p chart. Since $\sum \hat{p}_i = 1.52$, $\bar{p} = 1.52/25 = .0608$ and

$$LCL = .0608 - 3\sqrt{(.0608)(.9392)/100} = .0608 - .0717 = -.0109$$

$$UCL = .0608 + 3\sqrt{(.0608)(.9392)/100} = .0608 + .0717 = .1325$$

The LCL is therefore set at 0. The chart pictured in Figure 16.5 shows that all points
are within the control limits. This is consistent with an in-control process.


Figure 16.5 Control chart for fraction-defective data of Example 16.6


The c Chart for Number of Defectives 


We now consider situations in which the observation at each time point is the number 
of defects in a unit of some sort. The unit may consist of a single item (e.g., one auto- 
mobile) or a group of items (e.g., blemishes on a set of four tires). In the second case, 
the group size is assumed to be the same at each time point. 

The control chart for number of defectives is based on the Poisson probability
distribution. Recall that if Y is a Poisson random variable with parameter $\mu$, then

$$E(Y) = \mu \qquad V(Y) = \mu \qquad \sigma_Y = \sqrt{\mu}$$

Also, Y has approximately a normal distribution when $\mu$ is large ($\mu \ge 10$ will suffice
for most purposes). Furthermore, if $Y_1, Y_2, \ldots, Y_n$ are independent Poisson variables
with parameters $\mu_1, \mu_2, \ldots, \mu_n$, it can be shown that $Y_1 + \cdots + Y_n$ has a Poisson
distribution with parameter $\mu_1 + \cdots + \mu_n$. In particular, if $\mu_1 = \cdots = \mu_n = \mu$ (the
distribution of the number of defects per item is the same for each item), then the
Poisson parameter is $n\mu$.

Let $\mu$ denote the Poisson parameter for the number of defects in a unit (it is
the expected number of defects per unit). In the case of known $\mu$ (or a chart based
on a target value),

$$LCL = \mu - 3\sqrt{\mu} \qquad UCL = \mu + 3\sqrt{\mu}$$

With $x_i$ denoting the total number of defects in the ith unit (i = 1, 2, 3, ...),
points at heights $x_1, x_2, x_3, \ldots$ are plotted on the chart. Usually the value of $\mu$ must
be estimated from the data. Since $E(X_i) = \mu$, it is natural to use the estimate $\hat{\mu} = \bar{x}$
(based on $x_1, x_2, \ldots, x_k$).

The c chart for the number of defectives in a unit has center line at $\bar{x}$ and

$$LCL = \bar{x} - 3\sqrt{\bar{x}}$$
$$UCL = \bar{x} + 3\sqrt{\bar{x}}$$

If LCL is negative, it is replaced by 0.


EXAMPLE 16.7. A company manufactures metal panels that are baked after first being coated with a 
slurry of powdered ceramic. Flaws sometimes appear in the finish of these panels, and 
the company wishes to establish a control chart for the number of flaws. The number 
of flaws in each of the 24 panels sampled at regular time intervals are as follows: 


7 10 9 12 13 6 13 7 5 11 8 10 
13 9 21 10 6 8 3 12 7 11 14 10 


with $\sum x_i = 235$ and $\hat{\mu} = \bar{x} = 235/24 = 9.79$. The control limits are

$$LCL = 9.79 - 3\sqrt{9.79} = .40 \qquad UCL = 9.79 + 3\sqrt{9.79} = 19.18$$

The control chart is in Figure 16.6. The point corresponding to the fifteenth panel
lies above the UCL. Upon investigation, the slurry used on that panel was discovered
to be of unusually low viscosity (an assignable cause). Eliminating that observation
gives $\bar{x} = 214/23 = 9.30$ and new control limits

$$LCL = 9.30 - 3\sqrt{9.30} = .15 \qquad UCL = 9.30 + 3\sqrt{9.30} = 18.45$$


Figure 16.6 Control chart for number of flaws data of Example 16.7


The remaining 23 observations all lie between these limits, indicating an in-control
process.
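
The two-stage calculation in Example 16.7 (compute limits, drop the flagged panel, recompute) can be sketched as follows. The function name `c_chart_limits` is illustrative, and the automatic deletion step is a simplification: in the example the point was removed only after an assignable cause had been found.

```python
from math import sqrt

def c_chart_limits(counts):
    """Return (LCL, center line, UCL) for a 3-sigma c chart."""
    x_bar = sum(counts) / len(counts)
    half_width = 3 * sqrt(x_bar)
    return max(0.0, x_bar - half_width), x_bar, x_bar + half_width

# Numbers of flaws on the 24 panels of Example 16.7.
flaws = [7, 10, 9, 12, 13, 6, 13, 7, 5, 11, 8, 10,
         13, 9, 21, 10, 6, 8, 3, 12, 7, 11, 14, 10]

lcl, center, ucl = c_chart_limits(flaws)
signals = [i + 1 for i, x in enumerate(flaws) if x < lcl or x > ucl]

# Recompute the limits from the observations that were not flagged.
kept = [x for x in flaws if lcl <= x <= ucl]
lcl2, center2, ucl2 = c_chart_limits(kept)
print(signals, round(lcl2, 2), round(ucl2, 2))
```

The revised limits agree with the hand calculation to within rounding, since the code does not round $\bar{x}$ to 9.30 before taking the square root.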


Control Charts Based on Transformed Data 


The use of 3-sigma control limits is presumed to result in P(statistic < LCL) ≈
P(statistic > UCL) ≈ .0013 when the process is in control. However, when p is
small, the normal approximation to the distribution of $\hat{p} = X/n$ will often not be
very accurate in the extreme tails. Table 16.3 gives evidence of this behavior for
selected values of p and n (the value of p is used to calculate the control limits). In
many cases, the probability that a single point falls outside the control limits is very
different from the nominal probability of .0026.

Table 16.3 In-Control Probabilities for a p Chart

 p      n      P(p̂ < LCL)   P(p̂ > UCL)   P(out-of-control point)
.10    100     .00003        .00198        .00201
.10    200     .00048        .00299        .00347
.10    400     .00044        .00171        .00215
.05    200     .00004        .00266        .00270
.05    400     .00020        .00207        .00227
.05    600     .00031        .00189        .00220
.02    600     .00007        .00275        .00282
.02    800     .00036        .00374        .00410
.02   1000     .00023        .00243        .00266
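
Entries of this kind can be reproduced by summing exact binomial probabilities outside the 3-sigma limits. The sketch below is one way to do that; the function name and the small tolerance `eps` (an assumption added here to guard against floating-point ties when a limit coincides with an attainable value of x/n) are not from the text.

```python
from math import comb, sqrt

def in_control_tail_probs(p, n, eps=1e-9):
    """Exact P(p-hat < LCL) and P(p-hat > UCL) for a p chart with known p."""
    half_width = 3 * sqrt(p * (1 - p) / n)
    lcl, ucl = p - half_width, p + half_width
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    below = sum(pmf[x] for x in range(n + 1) if x / n < lcl - eps)
    above = sum(pmf[x] for x in range(n + 1) if x / n > ucl + eps)
    return below, above

below, above = in_control_tail_probs(p=0.10, n=100)
print(round(below, 5), round(above, 5), round(below + above, 5))
```

For p = .10 and n = 100 the limits are .01 and .19, so only X = 0 lies below the LCL and only X ≥ 20 lies above the UCL, which is why the two tail probabilities are so unequal.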

This problem can be remedied by applying a transformation to the data. Let
h(X) denote a function applied to transform the binomial variable X. Then h(·)
should be chosen so that h(X) has approximately a normal distribution and this
approximation is accurate in the tails. A recommended transformation is based on
the arcsin (i.e., $\sin^{-1}$) function:

$$Y = h(X) = \sin^{-1}\left(\sqrt{X/n}\right)$$

Then Y is approximately normal with mean value $\sin^{-1}(\sqrt{p})$ and variance 1/(4n);
note that the variance is independent of p. Let $y_i = \sin^{-1}(\sqrt{x_i/n})$. Then points on
the control chart are at heights $y_1, y_2, \ldots$. For known p, the control limits are

$$LCL = \sin^{-1}(\sqrt{p}) - 3\sqrt{1/(4n)} \qquad UCL = \sin^{-1}(\sqrt{p}) + 3\sqrt{1/(4n)}$$

When p is not known, $\sin^{-1}(\sqrt{p})$ is replaced by $\bar{y}$.

Similar comments apply to the Poisson distribution when $\mu$ is small. The
suggested transformation is $Y = h(X) = 2\sqrt{X}$, which has mean value $2\sqrt{\mu}$ and variance 1.
Resulting control limits are $2\sqrt{\mu} \pm 3$ when $\mu$ is known and $\bar{y} \pm 3$ otherwise.
The book Statistical Methods for Quality Improvement listed in the chapter bibliography
discusses these issues in greater detail.
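
Both transformed charts are easy to compute. The sketch below assumes known p (or known $\mu$); the function names and the inputs p = .05, n = 200, x = 10 are illustrative choices rather than values from an example in the text.

```python
from math import asin, sqrt

def arcsine_limits(p, n):
    """(LCL, center, UCL) for the arcsine-transformed p chart, known p."""
    center = asin(sqrt(p))
    half_width = 3 * sqrt(1 / (4 * n))  # does not depend on p
    return center - half_width, center, center + half_width

def arcsine_point(x, n):
    """Plotted height y_i for x defectives in a sample of n items."""
    return asin(sqrt(x / n))

def sqrt_poisson_limits(mu):
    """(LCL, center, UCL) for the chart of 2*sqrt(X), known mu."""
    center = 2 * sqrt(mu)
    return center - 3, center, center + 3

lcl, center, ucl = arcsine_limits(p=0.05, n=200)
y = arcsine_point(x=10, n=200)
print(lcl <= y <= ucl, sqrt_poisson_limits(4.0))
```

Note that Python's `asin` works in radians, which is the scale on which the variance 1/(4n) applies.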


EXERCISES Section 16.4 (21-28)


21. On each of the previous 25 days, 100 electronic devices
of a certain type were randomly selected and subjected to
a severe heat stress test. The total number of items that
failed to pass the test was 578.
a. Determine control limits for a 3-sigma p chart.
b. The highest number of failed items on a given day
was 39, and the lowest number was 13. Does either of
these correspond to an out-of-control point? Explain.

22. A sample of 200 ROM computer chips was selected on
each of 30 consecutive days, and the number of nonconforming
chips on each day was as follows: 10, 18, 24, 17,
37, 19, 7, 25, 11, 24, 29, 15, 16, 21, 18, 17, 15, 22, 12,
20, 17, 18, 12, 24, 30, 16, 11, 20, 14, 28. Construct a
p chart and examine it for any out-of-control points.

23. When n = 150, what is the smallest value of $\bar{p}$ for which
the LCL in a p chart is positive?

24. Refer to the data of Exercise 22, and construct a control
chart using the $\sin^{-1}$ transformation as suggested in
the text.

25. The accompanying observations are numbers of defects in
25 1-square-yard specimens of woven fabric of a certain
type: 3, 7, 5, 3, 4, 2, 8, 4, 3, 3, 6, 7, 2, 3, 2, 4, 7, 3, 2, 4, 4,
1, 5, 4, 6. Construct a c chart for the number of defects.

26. For what $\bar{x}$ values will the LCL in a c chart be negative?

27. In some situations, the sizes of sampled specimens vary,
and larger specimens are expected to have more defects
than smaller ones. For example, sizes of fabric samples
inspected for flaws might vary over time. Alternatively,
the number of items inspected might change with
time. Let

$$u_i = \frac{\text{the number of defects observed at time } i}{\text{size of entity inspected at time } i} = \frac{x_i}{g_i}$$

where “size” might refer to area, length, volume, or
simply the number of items inspected. Then a u chart
plots $u_1, u_2, \ldots$, has center line $\bar{u}$, and the control limits
for the ith observations are $\bar{u} \pm 3\sqrt{\bar{u}/g_i}$.
Painted panels were examined in time sequence,
and for each one, the number of blemishes in a specified
sampling region was determined. The surface area
(ft²) of the region examined varied from panel to panel.
Results are given below. Construct a u chart.

Panel   Area Examined   No. of Blemishes
  1         .8               3
  2         .6               2
  3         .8               3
  4         .8               2
  5        1.0               5
  6        1.0               5
  7         .8              10
  8        1.0              12
  9         .6               4
 10         .6               2
 11         .6               1
 12         .8               3
 13         .8               5
 14        1.0               4
 15        1.0               6
 16        1.0              12
 17         .8               3
 18         .6               3
 19         .6               5
 20         .6               1

28. Construct a control chart for the data of Exercise 25 by
using the transformation suggested in the text.


16.5 CUSUM Procedures 


A defect of the traditional $\bar{X}$ chart is its inability to detect a relatively small change
in a process mean. This is largely a consequence of the fact that whether a process 
is judged out of control at a particular time depends only on the sample at that time, 
and not on the past history of the process. Cumulative sum (CUSUM) control 
charts and procedures have been designed to remedy this defect. 

There are two equivalent versions of a CUSUM procedure for a process mean, 
one graphical and the other computational. The computational version is used almost 
exclusively in practice, but the logic behind the procedure is most easily grasped by 
first considering the graphical form. 


The V-Mask 


Let $\mu_0$ denote a target value or goal for the process mean, and define cumulative sums by

$$S_1 = \bar{x}_1 - \mu_0$$
$$S_2 = (\bar{x}_1 - \mu_0) + (\bar{x}_2 - \mu_0) = \sum_{i=1}^{2} (\bar{x}_i - \mu_0)$$
$$\vdots$$
$$S_l = (\bar{x}_1 - \mu_0) + \cdots + (\bar{x}_l - \mu_0) = \sum_{i=1}^{l} (\bar{x}_i - \mu_0)$$

(in the absence of a target value, $\bar{\bar{x}}$ is used in place of $\mu_0$). These cumulative sums
are plotted over time. That is, at time l, we plot a point at height $S_l$. At the current
time point r, the plotted points are $(1, S_1), (2, S_2), (3, S_3), \ldots, (r, S_r)$.

Now a V-shaped “mask” is superimposed on the plot, as shown in Figure 16.7.
The point 0, which lies a distance d behind the point at which the two arms of
the mask intersect, is positioned at the current CUSUM point $(r, S_r)$. At time r, the
process is judged out of control if any of the plotted points lies outside the V-mask,
either above the upper arm or below the lower arm. When the process is in control,
the $\bar{x}_i$’s will vary around the target value $\mu_0$, so successive $S_l$’s should vary around 0.
Suppose, however, that at a certain time, the process mean shifts to a value larger than
the target. From that point on, differences $\bar{x}_i - \mu_0$ will tend to be positive, so that
successive $S_l$’s will increase and plotted points will drift upward. If a shift has occurred
prior to the current time point r, there is a good chance that $(r, S_r)$ will be substantially
higher than some other points in the plot, in which case these other points will be
below the lower arm of the mask. Similarly, a shift to a value smaller than the target
will subsequently result in points above the upper arm of the mask.


Figure 16.7 CUSUM plots: (a) successive points $(l, S_l)$ in a CUSUM plot; (b) a V-mask with
$0 = (r, S_r)$; (c) an in-control process; (d) an out-of-control process


Any particular V-mask is determined by specifying the “lead distance” d and
“half-angle” $\theta$, or, equivalently, by specifying d and the length h of the vertical line
segment from 0 to the lower (or to the upper) arm of the mask. One method for deciding
which mask to use involves specifying the size of a shift in the process mean
that is of particular concern to an investigator. Then the parameters of the mask are
chosen to give desired values of $\alpha$ and $\beta$, the false-alarm probability and the probability
of not detecting the specified shift, respectively. An alternative method involves
selecting the mask that yields specified values of the ARL (average run length) both
for an in-control process and for a process in which the mean has shifted by a designated
amount. After developing the computational form of the CUSUM procedure,
we will illustrate the second method of construction.


EXAMPLE 16.8 A wood products company manufactures charcoal briquettes for barbecues. It packages
these briquettes in bags of various sizes, the largest of which is supposed to
contain 40 lb. Table 16.4 displays the weights of bags from 16 different samples,
each of size n = 4. The first 10 of these were drawn from a normal distribution with
$\mu = \mu_0 = 40$ and $\sigma = .5$. Starting with the eleventh sample, the mean has shifted
upward to $\mu = 40.3$.


Table 16.4 Observations, $\bar{x}$’s, and Cumulative Sums for Example 16.8

Sample
Number   Observations                    $\bar{x}$   $\sum(\bar{x}_i - 40)$
 1       40.77  39.95  40.86  39.21      40.20         .20
 2       38.94  39.70  40.37  39.88      39.72        -.08
 3       40.43  40.27  40.91  40.05      40.42         .34
 4       39.55  40.10  39.39  40.89      39.98         .32
 5       41.01  39.07  39.85  40.32      40.06         .38
 6       39.06  39.90  39.84  40.22      39.76         .14
 7       39.63  39.42  40.04  39.50      39.65        -.21
 8       41.05  40.74  40.43  39.40      40.41         .20
 9       40.28  40.89  39.61  40.48      40.32         .52
10       39.28  40.49  38.88  40.72      39.84         .36
11       40.57  40.04  40.85  40.51      40.49         .85
12       39.90  40.67  40.51  40.53      40.40        1.25
13       40.70  40.54  40.73  40.45      40.61        1.86
14       39.58  40.90  39.62  39.83      39.98        1.84
15       40.16  40.69  40.37  39.69      40.23        2.07
16       40.46  40.21  40.09  40.58      40.34        2.41


Figure 16.8 displays an $\bar{X}$ chart with control limits

$$\mu_0 \pm 3\sigma_{\bar{X}} = 40 \pm 3 \cdot (.5/\sqrt{4}) = 40 \pm .75$$


Figure 16.8 $\bar{X}$ control chart for the data of Example 16.8

No point on the chart lies outside the control limits. This chart suggests a stable
process for which the mean has remained on target.

Figure 16.9 shows CUSUM plots with a particular V-mask superimposed. The
plot in Figure 16.9(a) is for current time r = 12. All points in this plot lie inside the
arms of the mask. However, the plot for r = 13 displayed in Figure 16.9(b) gives an
out-of-control signal. The point falling below the lower arm of the mask suggests an
increase in the value of the process mean. The mask at r = 16 is even more emphatic
in its out-of-control message. This is in marked contrast to the $\bar{X}$ chart.


Figure 16.9 CUSUM plots and V-masks for data of Example 16.8: (a) V-mask at time r = 12,
process in control; (b) V-mask at time r = 13, out-of-control signal
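
The cumulative sums plotted in a CUSUM chart are simple running totals. As a small sketch, the last column of Table 16.4 can be regenerated from the sample means with `itertools.accumulate`; the variable names are illustrative.

```python
from itertools import accumulate

# Sample means (x-bar) from Table 16.4 and the target value mu_0.
x_bars = [40.20, 39.72, 40.42, 39.98, 40.06, 39.76, 39.65, 40.41,
          40.32, 39.84, 40.49, 40.40, 40.61, 39.98, 40.23, 40.34]
mu_0 = 40.0

# S_l = sum of (x-bar_i - mu_0) for i = 1, ..., l
cusum = list(accumulate(x - mu_0 for x in x_bars))
rounded = [round(s, 2) for s in cusum]
print(rounded)
```

The sums hover near zero through the tenth sample and then climb steadily, which is the upward drift that the V-mask at r = 13 detects.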


A Computational Version 


The following computational form of the CUSUM procedure is equivalent to the 
previous graphical description. 


Let $d_0 = e_0 = 0$, and calculate $d_1, d_2, d_3, \ldots$ and $e_1, e_2, e_3, \ldots$ recursively, using
the relationships

$$d_l = \max[0,\; d_{l-1} + (\bar{x}_l - (\mu_0 + k))]$$
$$e_l = \max[0,\; e_{l-1} - (\bar{x}_l - (\mu_0 - k))] \qquad (l = 1, 2, 3, \ldots)$$

Here the symbol k denotes the slope of the lower arm of the V-mask, and its
value is customarily taken as $\Delta/2$ (where $\Delta$ is the size of a shift in $\mu$ on which
attention is focused).

If at current time r either $d_r > h$ or $e_r > h$, the process is judged to be out
of control. The first inequality suggests that the process mean has shifted to a
value greater than the target, whereas $e_r > h$ indicates a shift to a smaller value.


EXAMPLE 16.9 Reconsider the charcoal briquette data displayed in Table 16.4 of Example 16.8. The
target value is $\mu_0 = 40$, and the size of a shift to be quickly detected is $\Delta = .3$. Thus

$$k = \frac{\Delta}{2} = .15 \qquad \mu_0 + k = 40.15 \qquad \mu_0 - k = 39.85$$

so

$$d_l = \max[0,\; d_{l-1} + (\bar{x}_l - 40.15)]$$
$$e_l = \max[0,\; e_{l-1} - (\bar{x}_l - 39.85)]$$

Calculation of the first few $d_l$’s proceeds as follows:

$$d_0 = 0$$
$$d_1 = \max[0,\; d_0 + (\bar{x}_1 - 40.15)] = \max[0,\; 0 + (40.20 - 40.15)] = .05$$
$$d_2 = \max[0,\; d_1 + (\bar{x}_2 - 40.15)] = \max[0,\; .05 + (39.72 - 40.15)] = 0$$
$$d_3 = \max[0,\; d_2 + (\bar{x}_3 - 40.15)] = \max[0,\; 0 + (40.42 - 40.15)] = .27$$

The remaining calculations are summarized in Table 16.5.


Table 16.5 CUSUM Calculations for Example 16.9

Sample
Number   $\bar{x}_l$   $\bar{x}_l - 40.15$   $d_l$   $\bar{x}_l - 39.85$   $e_l$
 1       40.20           .05                  .05      .35                  0
 2       39.72          -.43                  0       -.13                  .13
 3       40.42           .27                  .27      .57                  0
 4       39.98          -.17                  .10      .13                  0
 5       40.06          -.09                  .01      .21                  0
 6       39.76          -.39                  0       -.09                  .09
 7       39.65          -.50                  0       -.20                  .29
 8       40.41           .26                  .26      .56                  0
 9       40.32           .17                  .43      .47                  0
10       39.84          -.31                  .12     -.01                  .01
11       40.49           .34                  .46      .64                  0
12       40.40           .25                  .71      .55                  0
13       40.61           .46                 1.17      .76                  0
14       39.98          -.17                 1.00      .13                  0
15       40.23           .08                 1.08      .38                  0
16       40.34           .19                 1.27      .49                  0


The value h = .95 gives a CUSUM procedure with desirable properties:
false alarms (incorrect out-of-control signals) rarely occur, yet a shift of $\Delta = .3$
will usually be detected rather quickly. With this value of h, the first out-of-control
signal comes after the 13th sample is available. Since $d_{13} = 1.17 > .95$, it appears
that the mean has shifted to a value larger than the target. This is the same message
as the one given by the V-mask in Figure 16.9(b).
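
The recursion of Example 16.9 can be written as a short loop. The following is a minimal sketch under the example's settings ($\mu_0 = 40$, k = .15, h = .95); the function names `cusum_recursion` and `first_signal` are illustrative.

```python
def cusum_recursion(x_bars, mu_0, k):
    """Return the lists (d_1, ..., d_r) and (e_1, ..., e_r), with d_0 = e_0 = 0."""
    d, e = 0.0, 0.0
    d_vals, e_vals = [], []
    for x in x_bars:
        d = max(0.0, d + (x - (mu_0 + k)))  # accumulates evidence of an upward shift
        e = max(0.0, e - (x - (mu_0 - k)))  # accumulates evidence of a downward shift
        d_vals.append(d)
        e_vals.append(e)
    return d_vals, e_vals

def first_signal(d_vals, e_vals, h):
    """Index (starting at 1) of the first sample with d > h or e > h, else None."""
    for l, (d, e) in enumerate(zip(d_vals, e_vals), start=1):
        if d > h or e > h:
            return l
    return None

# Sample means from Table 16.4.
x_bars = [40.20, 39.72, 40.42, 39.98, 40.06, 39.76, 39.65, 40.41,
          40.32, 39.84, 40.49, 40.40, 40.61, 39.98, 40.23, 40.34]
d_vals, e_vals = cusum_recursion(x_bars, mu_0=40.0, k=0.15)
print(first_signal(d_vals, e_vals, h=0.95))
```

Only the two running values d and e need to be stored between samples, which is one reason the computational form is preferred in practice over the graphical V-mask.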

To demonstrate equivalence, again let r denote the current time point, so that
$\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_r$ are available. Figure 16.10 displays a V-mask with the point labeled
0 at $(r, S_r)$. The slope of the lower arm, which we denote by k, is h/d. Thus the
points on the lower arm above r, r − 1, r − 2, ... are at heights $S_r - h$, $S_r - h - k$,
$S_r - h - 2k$, and so on.


Figure 16.10 A V-mask with slope of lower arm = k


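The process is in control if all points are on or between the arms of the mask.
We wish to describe this condition algebraically. To do so, let

$$T_l = \sum_{i=1}^{l} [\bar{x}_i - (\mu_0 + k)] \qquad l = 1, 2, 3, \ldots, r$$

The conditions under which all points are on or above the lower arm are

$$S_r - h \le S_r \ \text{(trivially satisfied)} \qquad \text{i.e., } S_r \le S_r + h$$
$$S_r - h - k \le S_{r-1} \qquad \text{i.e., } S_r \le S_{r-1} + h + k$$
$$S_r - h - 2k \le S_{r-2} \qquad \text{i.e., } S_r \le S_{r-2} + h + 2k$$
$$\vdots$$

Now subtract rk from both sides of each inequality to obtain

$$S_r - rk \le S_r - rk + h \qquad \text{i.e., } T_r \le T_r + h$$
$$S_r - rk \le S_{r-1} - (r - 1)k + h \qquad \text{i.e., } T_r \le T_{r-1} + h$$
$$S_r - rk \le S_{r-2} - (r - 2)k + h \qquad \text{i.e., } T_r \le T_{r-2} + h$$
$$\vdots$$

Thus all plotted points lie on or above the lower arm if and only if (iff) $T_r - T_r \le h$,
$T_r - T_{r-1} \le h$, $T_r - T_{r-2} \le h$, and so on. This is equivalent to

$$T_r - \min(T_1, T_2, \ldots, T_r) \le h$$

In a similar manner, if we let

$$V_r = \sum_{i=1}^{r} [\bar{x}_i - (\mu_0 - k)] = S_r + rk$$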




it can be shown that all points lie on or below the upper arm iff

$$\max(V_1, \ldots, V_r) - V_r \le h$$

If we now let

$$d_r = T_r - \min(T_1, \ldots, T_r)$$
$$e_r = \max(V_1, \ldots, V_r) - V_r$$

it is easily seen that $d_1, d_2, \ldots$ and $e_1, e_2, \ldots$ can be calculated recursively as illustrated
previously. For example, the expression for $d_r$ follows from consideration of two cases:

1. $\min(T_1, \ldots, T_r) = T_r$, whence $d_r = 0$
2. $\min(T_1, \ldots, T_r) = \min(T_1, \ldots, T_{r-1})$, so that

$$d_r = T_r - \min(T_1, \ldots, T_{r-1})$$
$$\quad = \bar{x}_r - (\mu_0 + k) + T_{r-1} - \min(T_1, \ldots, T_{r-1})$$
$$\quad = \bar{x}_r - (\mu_0 + k) + d_{r-1}$$

Since $d_r$ cannot be negative, it is the larger of these two quantities.
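
The agreement between the closed form and the recursion can be checked numerically. The sketch below makes one assumption that is not spelled out in the derivation: the minimum is taken over $T_0 = 0$ as well as $T_1, \ldots, T_r$, which is what the starting value $d_0 = 0$ of the recursion corresponds to. Function names are illustrative, and the data are the sample means of Table 16.4.

```python
from itertools import accumulate

def d_closed_form(x_bars, mu_0, k):
    """d_r = T_r - min(T_0, ..., T_r), with T_0 = 0 (assumed starting value)."""
    t = [0.0] + list(accumulate(x - (mu_0 + k) for x in x_bars))
    return [t[r] - min(t[: r + 1]) for r in range(1, len(t))]

def d_recursive(x_bars, mu_0, k):
    """d_r = max(0, d_{r-1} + (x-bar_r - (mu_0 + k))), with d_0 = 0."""
    d, out = 0.0, []
    for x in x_bars:
        d = max(0.0, d + (x - (mu_0 + k)))
        out.append(d)
    return out

x_bars = [40.20, 39.72, 40.42, 39.98, 40.06, 39.76, 39.65, 40.41,
          40.32, 39.84, 40.49, 40.40, 40.61, 39.98, 40.23, 40.34]
closed = d_closed_form(x_bars, mu_0=40.0, k=0.15)
recursive = d_recursive(x_bars, mu_0=40.0, k=0.15)
print(all(abs(a - b) < 1e-9 for a, b in zip(closed, recursive)))
```

The same comparison could be made for $e_r$ by replacing $T$ with $V$ and the minimum with a maximum.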


Designing a CUSUM Procedure 


Let $\Delta$ denote the size of a shift in $\mu$ that is to be quickly detected using a CUSUM
procedure.* It is common practice to let $k = \Delta/2$. Now suppose a quality control
practitioner specifies desired values of two average run lengths:

1. ARL when the process is in control ($\mu = \mu_0$)
2. ARL when the process is out of control because the mean has shifted by $\Delta$
($\mu = \mu_0 + \Delta$ or $\mu = \mu_0 - \Delta$)

A chart developed by Kenneth Kemp (“The Use of Cumulative Sums for Sampling
Inspection Schemes,” Applied Statistics, 1962: 23), called a nomogram, can then
be used to determine values of h and n that achieve the specified ARLs.† This chart
is shown as Figure 16.11. The method for using the chart is described in the accompanying
box. Either the value of $\sigma$ must be known or an estimate is used in its place.


Using the Kemp Nomogram
1. Locate the desired ARLs on the in-control and out-of-control scales.
Connect these two points with a line.
2. Note where the line crosses the k′ scale, and solve for n using the equation

$$k' = \frac{\Delta/2}{\sigma/\sqrt{n}}$$

Then round n up to the nearest integer.
3. Connect the point on the k′ scale with the point on the in-control ARL scale
using a second line, and note where this line crosses the h′ scale. Then

$$h = (\sigma/\sqrt{n}) \cdot h'$$


* This contrasts with previous notation, where $\Delta$ represented the number of standard deviations by which
$\mu$ changed.
† The word nomogram is not specific to this chart; nomograms are used for many other purposes.

Figure 16.11 The Kemp nomogram* (scales shown: k′, h′, in-control ARL, out-of-control ARL)


The value h = .95 was used in Example 16.9. In that situation, it follows that the
in-control ARL is 500 and the out-of-control ARL (for $\Delta = .3$) is 7.


EXAMPLE 16.10 The target value for the diameter of the interior core of a hydraulic pump is 2.250 in.
If the standard deviation of the core diameter is $\sigma = .004$, what CUSUM procedure
will yield an in-control ARL of 500 and an ARL of 5 when the mean core diameter
shifts by the amount of .003 in.?

Connecting the point 500 on the in-control ARL scale to the point 5 on the
out-of-control ARL scale and extending the line to the k′ scale on the far left in
Figure 16.11 gives k′ = .74. Thus

$$k' = .74 = \frac{\Delta/2}{\sigma/\sqrt{n}} = \frac{.0015}{.004/\sqrt{n}} = .375\sqrt{n}$$

so

$$\sqrt{n} = \frac{.74}{.375} = 1.973 \qquad n = (1.973)^2 = 3.894$$

The CUSUM procedure should therefore be based on the sample size n = 4. Now connecting
.74 on the k′ scale to 500 on the in-control ARL scale gives h′ = 3.2, from which

$$h = (\sigma/\sqrt{n}) \cdot (3.2) = (.004/\sqrt{4})(3.2) = .0064$$

An out-of-control signal results as soon as either $d_r > .0064$ or $e_r > .0064$.
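
Once k′ and h′ have been read from the nomogram, the remaining arithmetic is mechanical. The sketch below assumes those two readings are supplied by the user (they cannot be computed from the formulas in this section); the function name `cusum_design` is hypothetical.

```python
from math import ceil, sqrt

def cusum_design(delta, sigma, k_prime, h_prime):
    """Return (exact n, rounded-up n, h) from nomogram readings k' and h'.

    Solves k' = (delta/2) / (sigma/sqrt(n)) for n, rounds n up,
    and then sets h = (sigma/sqrt(n)) * h'.
    """
    n_exact = (k_prime * sigma / (delta / 2)) ** 2
    n = ceil(n_exact)
    h = (sigma / sqrt(n)) * h_prime
    return n_exact, n, h

# Readings from Example 16.10: k' = .74 and h' = 3.2.
n_exact, n, h = cusum_design(delta=0.003, sigma=0.004, k_prime=0.74, h_prime=3.2)
print(round(n_exact, 3), n, round(h, 4))
```

Because n is rounded up, the realized k′ is slightly larger than the value read from the chart, so the achieved ARLs will differ a little from the nominal targets.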


* SOURCE: Kemp, Kenneth W., “The Use of Cumulative Sums for Sampling
Inspection Schemes,” Applied Statistics, Vol. XI, 1962: 23.

We have discussed CUSUM procedures for controlling process location. There
are also CUSUM procedures for controlling process variation and for attribute data. 
The chapter references should be consulted for information on these procedures. 


EXERCISES Section 16.5 (29-32)


29. Containers of a certain treatment for septic tanks are supposed
to contain 16 oz of liquid. A sample of five containers
is selected from the production line once each hour,
and the sample average content is determined. Consider
the following results: 15.992, 16.051, 16.066, 15.912,
16.030, 16.060, 15.982, 15.899, 16.038, 16.074, 16.029,
15.935, 16.032, 15.960, 16.055. Using $\Delta = .10$ and
h = .20, employ the computational form of the CUSUM
procedure to investigate the behavior of this process.

30. The target value for the diameter of a certain type of
driveshaft is .75 in. The size of the shift in the average
diameter considered important to detect is .002 in.
Sample average diameters for successive groups of n = 4
shafts are as follows: .7507, .7504, .7492, .7501, .7503,
.7510, .7490, .7497, .7488, .7504, .7516, .7472, .7489,
.7483, .7471, .7498, .7460, .7482, .7470, .7493, .7462,
.7481. Use the computational form of the CUSUM procedure
with h = .003 to see whether the process mean
remained on target throughout the time of observation.

31. The standard deviation of a certain dimension on an aircraft
part is .005 cm. What CUSUM procedure will give
an in-control ARL of 600 and an out-of-control ARL of 4
when the mean value of the dimension shifts by .004 cm?

32. When the out-of-control ARL corresponds to a shift of
1 standard deviation in the process mean, what are the
characteristics of the CUSUM procedure that has ARLs
of 250 and 4.8, respectively, for the in-control and
out-of-control conditions?


16.6 Acceptance Sampling 


Items coming from a production process are often sent in groups to another company 
or commercial establishment. A group might consist of all units from a particular 
production run or shift, in a shipping container of some sort, sent in response to a 
particular order, and so on. The group of items is usually called a lot, the sender is 
referred to as a producer, and the recipient of the lot is the consumer. Our focus will 
be on situations in which each item is either defective or nondefective, with p denot- 
ing the proportion of defective units in the lot. The consumer would naturally want 
to accept the lot only if the value of p is suitably small. Acceptance sampling is that 
part of applied statistics dealing with methods for deciding whether the consumer 
should accept or reject a lot. 

Until quite recently, control chart procedures and acceptance sampling tech- 
niques were regarded by practitioners as equally important parts of quality control 
methodology. This is no longer the case. The reason is that the use of control 
charts and other recently developed strategies offers the opportunity to design 
quality into a product, whereas acceptance sampling deals with what has already 
been produced and thus does not provide for any direct control over process qual- 
ity. This led the late American quality control expert W. E. Deming, a major force 
in persuading the Japanese to make substantial use of quality control methodol- 
ogy, to argue strongly against the use of acceptance sampling in many situations. 
In a similar vein, the recent book by Ryan (see the chapter bibliography) devotes 
several chapters to control charts and mentions acceptance sampling only in passing.
As a reflection of this deemphasis, we content ourselves here with a brief 
introduction to basic concepts. 


Single-Sampling Plans 


The most straightforward type of acceptance sampling plan involves selecting a 
single random sample of size n and then rejecting the lot if the number of defectives 
in the sample exceeds a specified critical value c. Let the rv X denote the number 
of defective items in the sample and A denote the event that the lot is accepted. Then 
P(A) = P(X ≤ c) is a function of p; the larger the value of p, the smaller will be the 
probability of accepting the lot. 

If the sample size n is large relative to N, P(A) is calculated using the 
hypergeometric distribution (the number of defectives in the lot is Np): 


$$P(X \le c) = \sum_{x=0}^{c} h(x; n, Np, N) = \sum_{x=0}^{c} \frac{\binom{Np}{x}\binom{N(1-p)}{n-x}}{\binom{N}{n}}$$


When n is small relative to N (the rule of thumb suggested previously was n ≤ .05N, 
but some authors employ the less conservative rule n ≤ .10N), the binomial 
distribution can be used: 


$$P(X \le c) \approx \sum_{x=0}^{c} b(x; n, p) = \sum_{x=0}^{c} \binom{n}{x} p^x (1-p)^{n-x}$$


Finally, if P(A) is large only when p is small (this depends on the value of c), the 
Poisson approximation to the binomial distribution is justified: 


$$P(X \le c) \approx \sum_{x=0}^{c} p(x; np) = \sum_{x=0}^{c} \frac{e^{-np}(np)^x}{x!}$$


The behavior of a sampling plan can be nicely summarized by graphing P(A) 
as a function of p. Such a graph is called the operating characteristic (OC) curve 
for the plan. 
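
As an illustration of the three ways of computing P(A) described above, the following Python sketch evaluates the hypergeometric, binomial, and Poisson versions side by side. The function names and the lot size N = 2000 are assumptions chosen for the illustration, and Np is rounded to a whole number of defectives.

```python
from math import comb, exp, factorial

def accept_prob_hypergeometric(n, c, N, p):
    """P(X <= c) when sampling n items without replacement from a lot of N."""
    M = round(N * p)  # number of defectives in the lot (assumed whole)
    return sum(comb(M, x) * comb(N - M, n - x) for x in range(c + 1)) / comb(N, n)

def accept_prob_binomial(n, c, p):
    """Binomial version, appropriate when n is small relative to N."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

def accept_prob_poisson(n, c, p):
    """Poisson approximation to the binomial with mean np."""
    mu = n * p
    return sum(exp(-mu) * mu**x / factorial(x) for x in range(c + 1))

n, c, N, p = 50, 2, 2000, 0.05
print(f"hypergeometric: {accept_prob_hypergeometric(n, c, N, p):.4f}")
print(f"binomial:       {accept_prob_binomial(n, c, p):.4f}")
print(f"Poisson:        {accept_prob_poisson(n, c, p):.4f}")
```

With n/N = .025 the three values should agree closely, which is the point of the rule of thumb.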


EXAMPLE 16.11 Consider the sampling plan with c = 2 and n = 50. If the lot size N exceeds 1000, 
the binomial distribution can be used. This gives 


$$P(A) = P(X \le 2) = (1 - p)^{50} + 50p(1 - p)^{49} + 1225p^2(1 - p)^{48}$$


The accompanying table shows P(A) for selected values of p, and the corresponding 
operating characteristic (OC) curve is shown in Figure 16.12. 


p      .01   .02   .03   .04   .05   .06   .07   .08   .09   .10   .12   .15
P(A)   .986  .922  .811  .677  .541  .416  .311  .226  .161  .112  .051  .014


Figure 16.12 OC curve for sampling plan with c = 2, n = 50

The OC curve for the plan of Example 16.11 has P(A) near 1 for p very close to 0. 
However, in many applications a defective rate of 8% [for which P(A) = .226] 
or even just 5% [P(A) = .541] would be considered excessive, in which case the 
acceptance probabilities are too high. Increasing the critical value c while holding n 
fixed gives a plan for which P(A) increases at each p (except 0 and 1), so the new OC 
curve lies above the old one. This is desirable for p near 0 but not for larger values 
of p. Holding c constant while increasing n gives a lower OC curve, which is fine for 
larger p but not for p close to 0. We want an OC curve that is higher for very small 
p and lower for larger p. This requires increasing n and adjusting c. 
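
The effect of changing c or n on the OC curve can be checked numerically. The sketch below tabulates the binomial P(A) for the plan of Example 16.11 and for two illustrative variants (c raised to 3, n raised to 100); the function name `oc_value` and the two variant plans are assumptions introduced only for this comparison.

```python
from math import comb

def oc_value(n, c, p):
    """Binomial acceptance probability P(X <= c) for a single-sample plan."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

p_values = [.01, .02, .03, .04, .05, .06, .07, .08, .09, .10, .12, .15]

print("   p   n=50,c=2  n=50,c=3  n=100,c=2")
for p in p_values:
    base = oc_value(50, 2, p)      # plan of Example 16.11
    larger_c = oc_value(50, 3, p)  # same n, larger c: curve moves up
    larger_n = oc_value(100, 2, p) # same c, larger n: curve moves down
    print(f"{p:5.2f}   {base:7.3f}   {larger_c:7.3f}   {larger_n:8.3f}")
```

The first column of probabilities should reproduce the table in Example 16.11 to three decimal places.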


Designing a Single-Sample Plan 
An effective sampling plan is one with the following characteristics: 


1. It has a specified high probability of accepting lots that the producer considers 
to be of good quality. 

2. It has a specified low probability of accepting lots that the consumer considers 
to be of poor quality. 


A plan of this sort can be developed by proceeding as follows. Let’s designate two 
different values of p, one for which P(A) is a specified value close to 1 and the other 
for which P(A) is a specified value near 0. These two values of p, say p₁ and p₂, 
are often called the acceptable quality level (AQL) and the lot tolerance percent 
defective (LTPD). That is, we require a plan for which 


1. P(A) = 1 − α when p = p₁ = AQL (α small) 
2. P(A) = β when p = p₂ = LTPD (β small) 


This is analogous to seeking a hypothesis testing procedure with specified type I error 
probability α and specified type II error probability β. For example, we might have 


AQL = .01     α = .05   (P(A) = .95) 

LTPD = .045   β = .10   (P(A) = .10) 

Because X is discrete, we must typically be content with values of n and c that 
approximately satisfy these conditions. 


Table 16.6 gives information from which n and c can be determined in the 
case α = .05, β = .10. 


Table 16.6 Factors for Determining n and c for a Single-Sample Plan with 
α = .05, β = .10

 c     np₁      np₂    p₂/p₁        c     np₁      np₂    p₂/p₁
 0     .051     2.30   45.10        8    4.695    12.99    2.77
 1     .355     3.89   10.96        9    5.425    14.21    2.62
 2     .818     5.32    6.50       10    6.169    15.41    2.50
 3    1.366     6.68    4.89       11    6.924    16.60    2.40
 4    1.970     7.99    4.06       12    7.690    17.78    2.31
 5    2.613     9.28    3.55       13    8.464    18.86    2.24
 6    3.285    10.53    3.21       14    9.246    20.13    2.18
 7    3.981    11.77    2.96       15   10.040    21.29    2.12


EXAMPLE 16.12 Let’s determine a plan for which AQL = p₁ = .01 and LTPD = p₂ = .045. The ratio 
of p₂ to p₁ is 


$$\frac{\text{LTPD}}{\text{AQL}} = \frac{p_2}{p_1} = \frac{.045}{.01} = 4.50$$


This value lies between the ratio 4.89 given in Table 16.6, for which c = 3, and 4.06, 
for which c = 4. Once one of these values of c is chosen, n can be determined either 
by dividing the np₁ value in Table 16.6 by p₁ or via np₂/p₂. Thus four different plans 
(two values of c, and for each two values of n) give approximately the specified 
value of α and β. Consider, for example, using c = 3 and 


$$n = \frac{np_1}{p_1} = \frac{1.366}{.01} = 136.6 \approx 137$$


Then 

$$\alpha = 1 - P(X \le 3 \text{ when } p = p_1) = .050$$

(the Poisson approximation with μ = 1.37 also gives .050) and 

$$\beta = P(X \le 3 \text{ when } p = p_2) = .131$$


The plan with c = 4 and n determined from np₂ = 7.99 has n = 178, α = .034, and 
β = .094. The larger sample size results in a plan with both α and β smaller than 
the corresponding specified values. 
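
The four candidate plans of Example 16.12 can be generated and checked with a short Python sketch. It assumes the binomial model and uses only the two rows of Table 16.6 that bracket the ratio 4.50; the names `FACTORS`, `binom_cdf`, and `achieved_risks` are illustrative, and n is obtained by rounding to the nearest integer as in the example.

```python
from math import comb

# (np1, np2) factors for c = 3 and c = 4, copied from Table 16.6
FACTORS = {3: (1.366, 6.68), 4: (1.970, 7.99)}

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Bin(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

def achieved_risks(n, c, p1, p2):
    """Return (alpha, beta) actually achieved by the plan (n, c)."""
    alpha = 1 - binom_cdf(c, n, p1)   # P(reject) at the AQL
    beta = binom_cdf(c, n, p2)        # P(accept) at the LTPD
    return alpha, beta

p1, p2 = 0.01, 0.045
for c, (np1, np2) in FACTORS.items():
    for n in (round(np1 / p1), round(np2 / p2)):
        alpha, beta = achieved_risks(n, c, p1, p2)
        print(f"c = {c}, n = {n:3d}: alpha = {alpha:.3f}, beta = {beta:.3f}")
```

For (n, c) = (137, 3) and (178, 4) the printed risks should agree with the values quoted in the example to about three decimal places.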


The book by Douglas Montgomery cited in the chapter bibliography contains a chart 
from which c and n can be determined for any specified α and β. 

It may happen that the number of defective items in the sample reaches c + 1 
before all items have been examined. For example, in the case c = 3 and n = 137, it 
may be that the 125th item examined is the fourth defective item, so that the remaining 
12 items need not be examined. However, it is generally recommended that all items 
be examined even when this does occur, in order to provide a lot-by-lot quality history 
and estimates of p over time. 


Double-Sampling Plans 


In a double-sampling plan, the number of defective items x₁ in an initial sample of 
size n₁ is determined. There are then three possible courses of action: Immediately 
accept the lot, immediately reject the lot, or take a second sample of n₂ items and 
reject or accept the lot depending on the total number x₁ + x₂ of defective items in 
the two samples. Besides the two sample sizes, a specific plan is characterized by 
three further numbers, c₁, r₁, and c₂, as follows: 


1. Reject the lot if x₁ ≥ r₁. 
2. Accept the lot if x₁ ≤ c₁. 
3. If c₁ < x₁ < r₁, take a second sample; then accept the lot if x₁ + x₂ ≤ c₂ and 
reject it otherwise. 


EXAMPLE 16.13 Consider the double-sampling plan with n₁ = 80, n₂ = 80, c₁ = 2, r₁ = 5, and 
c₂ = 6. Thus the lot will be accepted if (1) x₁ = 0, 1, or 2; (2) x₁ = 3 and 
x₂ = 0, 1, 2, or 3; or (3) x₁ = 4 and x₂ = 0, 1, or 2. 

Assuming that the lot size is large enough for the binomial approximation to 
apply, the probability P(A) of accepting the lot is 


$$P(A) = P(X_1 = 0, 1, \text{ or } 2) + P(X_1 = 3, X_2 = 0, 1, 2, \text{ or } 3) + P(X_1 = 4, X_2 = 0, 1, \text{ or } 2)$$

$$= \sum_{x_1=0}^{2} b(x_1; 80, p) + b(3; 80, p)\sum_{x_2=0}^{3} b(x_2; 80, p) + b(4; 80, p)\sum_{x_2=0}^{2} b(x_2; 80, p)$$


Again the graph of P(A) versus p is the plan’s OC curve. The OC curve for this plan 
appears in Figure 16.13. 
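
The acceptance probability of a general double-sampling plan can be sketched in Python as follows, assuming the binomial model for both samples. The function names are hypothetical; the final lines evaluate the plan of Example 16.13 at a few arbitrarily chosen values of p.

```python
from math import comb

def b(x, n, p):
    """Binomial pmf b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def B(x, n, p):
    """Binomial cdf B(x; n, p); an empty sum (x < 0) gives 0."""
    return sum(b(y, n, p) for y in range(x + 1))

def double_plan_accept_prob(n1, n2, c1, r1, c2, p):
    """P(A) for the plan: accept if x1 <= c1, reject if x1 >= r1,
    otherwise accept when x1 + x2 <= c2."""
    prob = B(c1, n1, p)                    # accepted on the first sample
    for x1 in range(c1 + 1, r1):           # c1 < x1 < r1: second sample taken
        prob += b(x1, n1, p) * B(c2 - x1, n2, p)
    return prob

for p in (0.01, 0.03, 0.05, 0.08):
    pa = double_plan_accept_prob(80, 80, 2, 5, 6, p)
    print(f"p = {p:.2f}: P(A) = {pa:.3f}")
```

Plotting these values against p over a finer grid would trace out the plan's OC curve.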


[Figure: P(A) plotted against p]

Figure 16.13 OC curve for the double-sampling plan of Example 16.13


One standard method for designing a double-sampling plan involves proceeding
as suggested earlier for single-sample plans. Specify values p₁ and p₂ along with 
corresponding acceptance probabilities 1 − α and β. Then find a plan that satisfies 
these conditions. The book by Montgomery provides tables similar to Table 16.6 
for this purpose in the cases n₁ = n₂ and n₂ = 2n₁ with 1 − α = .95, β = .10. Much 
more extensive tabulations of plans are available in other sources. 

Analogous to standard practice with single-sample plans, it is recommended 
that all items in the first sample be examined even when the (r₁ + 1)st defective is 
discovered prior to inspection of the n₁th item. However, it is customary to terminate 
inspection of the second sample if the number of defectives is sufficient to justify 
rejection before all items have been examined. This is referred to as curtailment in 
the second sample. Under curtailment, it can be shown that the expected number 
of items inspected in a double-sampling plan is smaller than the number of items 
examined in a single-sampling plan when the OC curves of the two plans are close to 
being identical. This is the major virtue of double-sampling plans. For more on these 
matters as well as a discussion of multiple and sequential sampling plans (which 
involve selecting items for inspection one by one rather than in groups), a book on 
quality control should be consulted. 


Rectifying Inspection and Other Design Criteria 


In some situations, sampling inspection is carried out using rectification. For 
single-sample plans, this means that each defective item in the sample is replaced 
with a satisfactory one, and if the number of defectives in the sample exceeds 
the acceptance cutoff c, all items in the lot are examined and good items are 
substituted for any defectives. Let N denote the lot size. One important characteristic
of a sampling plan with rectifying inspection is average outgoing quality, 
denoted by AOQ. This is the long-run proportion of defective items among those 
sent on after the sampling plan is employed. Now defectives will occur only 
among the N − n items not inspected in a lot judged acceptable on the basis of a 
sample. Suppose, for example, that P(A) = P(X ≤ c) = .985 when p = .01. Then, 
in the long run, the N − n items not in the sample go uninspected for 98.5% of lots, 
and we expect 1% of those items to be defective. This implies that the expected number of 
defectives in a randomly selected batch is (N − n) · P(A) · p = .00985(N − n). 
Dividing this by the number of items in a lot gives average outgoing quality: 


$$\text{AOQ} = \frac{(N - n) \cdot P(A) \cdot p}{N} \approx P(A) \cdot p \quad \text{if } N \gg n$$


Because AOQ = 0 when either p = 0 or p = 1 [P(A) = 0 in the latter case], it follows 
that there is a value of p between 0 and 1 for which AOQ is a maximum. The maximum 
value of AOQ is called the average outgoing quality limit, AOQL. For example, 
for the plan with n = 137 and c = 3 discussed previously, AOQL = .0142, the value 
of AOQ at p ≈ .02. 

Proper choices of n and c will yield a sampling plan for which AOQL is a 
specified small number. Such a plan is not, however, unique, so another condi- 
tion can be imposed. Frequently this second condition will involve the average 
(i.e., expected) total number inspected, denoted by ATI. The number of items 
inspected in a randomly chosen lot is a random variable that takes on the value n 
with probability P(A) and N with probability 1 — P(A). Thus the expected number 
of items inspected in a randomly selected lot is 


$$\text{ATI} = n \cdot P(A) + N \cdot (1 - P(A))$$


It is common practice to select a sampling plan that has a specified AOQL and, in 
addition, minimum ATI at a particular quality level p. 
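
The quantities AOQ, AOQL, and ATI are easy to evaluate numerically. The Python sketch below assumes the binomial model for P(A) and locates the AOQL by a simple grid search over p; the function names, the grid size, and the lot size N = 100000 (chosen so that N >> n) are assumptions made for the illustration.

```python
from math import comb

def accept_prob(n, c, p):
    """Binomial P(X <= c) for a single-sample plan."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

def aoq(N, n, c, p):
    """Average outgoing quality under rectifying inspection."""
    return (N - n) * accept_prob(n, c, p) * p / N

def ati(N, n, c, p):
    """Average total number inspected per lot."""
    pa = accept_prob(n, c, p)
    return n * pa + N * (1 - pa)

def aoql(N, n, c, grid=2000):
    """Grid search for the p in [0, 1] that maximizes AOQ."""
    candidates = [i / grid for i in range(grid + 1)]
    p_star = max(candidates, key=lambda p: aoq(N, n, c, p))
    return p_star, aoq(N, n, c, p_star)

N, n, c = 100_000, 137, 3   # N is a hypothetical large lot size
p_star, limit = aoql(N, n, c)
print(f"AOQL = {limit:.4f} at p = {p_star:.4f}")
print(f"ATI at p = .01: {ati(N, n, c, 0.01):.0f}")
```

A finer grid or a calculus-based maximization would sharpen the location of the maximum, but the value of AOQL itself changes very little near the peak.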


Standard Sampling Plans 


It may seem as though the determination of a sampling plan that simultaneously sat- 
isfies several criteria would be quite difficult. Fortunately, others have already laid 
the groundwork in the form of extensive tabulations of such plans. MIL STD 105D, 
developed by the military after World War II, is the most widely used set of plans. A 
civilian version, ANSI/ASQC Z1.4, is quite similar to the military version. A third set 
of plans that is quite popular was developed at Bell Laboratories prior to World War II 
by two applied statisticians named Dodge and Romig. The book by Montgomery (see 
the chapter bibliography) contains a readable introduction to the use of these plans. 


EXERCISES Section 16.6 (33-40) 


33. Consider the single-sample plan with c = 2 and n = 50, 
as discussed in Example 16.11, but now suppose that the 
lot size is N = 500. Calculate P(A), the probability of 
accepting the lot, for p = .01, .02,...,.10, using the 
hypergeometric distribution. Does the binomial approxi- 
mation give satisfactory results in this case? 


34. A sample of 50 items is to be selected from a batch
consisting of 5000 items. The batch will be accepted if the
sample contains at most one defective item. Calculate the
probability of lot acceptance for p = .01, .02, ..., .10, and
sketch the OC curve.

35. Refer to Exercise 34 and consider the plan with n = 100
and c = 2. Calculate P(A) for p = .01, .02, ..., .05, and
sketch the two OC curves on the same set of axes. Which
of the two plans is preferable (leaving aside the cost of
sampling) and why?

36. Develop a single-sample plan for which AQL = .02 and
LTPD = .07 in the case α = .05, β = .10. Once values
of n and c have been determined, calculate the achieved
values of α and β for the plan.

37. Consider the double-sampling plan for which both sample
sizes are 50. The lot is accepted after the first sample if the
number of defectives is at most 1, rejected if the number of
defectives is at least 4, and rejected after the second sample
if the total number of defectives is 6 or more. Calculate the
probability of accepting the lot when p = .01, .05, and .10.

38. Some sources advocate a somewhat more restrictive
type of double-sampling plan in which r₁ = c₂ + 1;
that is, the lot is rejected if at either stage the (total)
number of defectives is at least r₁ (see the book by
Montgomery). Consider this type of sampling plan with
n₁ = 50, n₂ = 100, c₁ = 1, and r₁ = 4. Calculate the
probability of lot acceptance when p = .02, .05, and .10.

39. Refer to Example 16.11, in which a single-sample plan
with n = 50 and c = 2 was employed.
a. Calculate AOQ for p = .01, .02, ..., .10. What does
this suggest about the value of p for which AOQ is a
maximum and the corresponding AOQL?
b. Determine the value of p for which AOQ is a maximum
and the corresponding value of AOQL. [Hint:
Use calculus.]
c. For N = 2000, calculate ATI for the values of p given
in part (a).

40. Consider the single-sample plan that utilizes n = 50 and
c = 1 when N = 2000. Determine the values of AOQ
and ATI for selected values of p, and graph each of these
against p. Also determine the value of AOQL.

SUPPLEMENTARY EXERCISES (41-46) 


41. Observations on shear strength for 26 subgroups of test
spot welds, each consisting of six welds, yield
Σx̄ᵢ = 10,980, Σsᵢ = 402, and Σrᵢ = 1074. Calculate
control limits for any relevant control charts.

42. The number of scratches on the surface of each of
24 rectangular metal plates is determined, yielding the
following data: 8, 1, 7, 5, 2, 0, 2, 3, 4, 3, 1, 2, 5, 7, 3, 4,
6, 5, 2, 4, 0, 10, 2, 6. Construct an appropriate control
chart, and comment.

43. The following numbers are observations on tensile
strength of synthetic fabric specimens selected from a
production process at equally spaced time intervals.
Construct appropriate control charts, and comment
(assume an assignable cause is identifiable for any
out-of-control observations).

 1.  51.3  51.7  49.5      12.  49.6  48.4  50.0
 2.  51.0  50.0  49.3      13.  49.8  51.2  49.7
 3.  50.8  51.1  49.0      14.  50.4  49.9  50.7
 4.  50.6  51.1  49.0      15.  49.4  49.5  49.0
 5.  49.6  50.5  50.9      16.  50.7  49.0  50.0
 6.  51.3  52.0  50.3      17.  50.8  49.5  50.9
 7.  49.7  50.5  50.3      18.  48.5  50.3  49.3
 8.  51.8  50.3  50.0      19.  49.6  50.6  49.4
 9.  48.6  50.5  50.7      20.  50.9  49.4  49.7
10.  49.6  49.8  50.5      21.  54.1  49.8  48.5
11.  49.9  50.7  49.8      22.  50.2  49.6  51.5

44. An alternative to the p chart for the fraction defective is
the np chart for number defective. This chart has

$$\text{UCL} = n\bar{p} + 3\sqrt{n\bar{p}(1 - \bar{p})}, \qquad \text{LCL} = n\bar{p} - 3\sqrt{n\bar{p}(1 - \bar{p})},$$

and the number of defectives from each sample is plotted
on the chart. Construct such a chart for the data of
Example 16.6. Will the use of an np chart always give the
same message as the use of a p chart (i.e., are the two
charts equivalent)?

45. Resistance observations (ohms) for subgroups of a certain
type of register gave the following summary quantities:

 i   nᵢ    x̄ᵢ     sᵢ         i   nᵢ    x̄ᵢ     sᵢ
 1   4   430.0   22.5       11   4   445.2   27.3
 2   4   418.2   20.6       12   4   430.1   22.2
 3   3   435.5   25.1       13   4   427.2   24.0
 4   4   427.6   22.3       14   4   439.6   23.3
 5   4   444.0   21.5       15   3   415.9   31.2
 6   3   431.4   28.9       16   4   419.8   27.5
 7   4   420.8   25.4       17   3   447.0   19.8
 8   4   431.4   24.0       18   4   434.4   23.7
 9   4   428.7   21.2       19   4   422.2   25.1
10   4   440.1   25.8       20   4   425.7   24.4

Construct appropriate control limits. [Hint: Use
$\bar{\bar{x}} = \sum n_i \bar{x}_i / \sum n_i$ and $s^2 = \sum (n_i - 1)s_i^2 / \sum (n_i - 1)$.]

46. Let α be a number between 0 and 1, and define a sequence
W₁, W₂, W₃, ... by W₀ = μ and W_t = αX̄_t + (1 − α)W_{t−1}
for t = 1, 2, .... Substituting for W_{t−1} its representation in
terms of X̄_{t−1} and W_{t−2}, then substituting for W_{t−2}, and so
on, results in

$$W_t = \alpha \bar{X}_t + \alpha(1 - \alpha)\bar{X}_{t-1} + \cdots + \alpha(1 - \alpha)^{t-1}\bar{X}_1 + (1 - \alpha)^t \mu$$

The fact that W_t depends not only on X̄_t but also on
averages for past time points, albeit with (exponentially)
decreasing weights, suggests that changes in the process
mean will be more quickly reflected in the W_t’s than in
the individual X̄_t’s.
a. Show that E(W_t) = μ.
b. Let σ_t² = V(W_t), and show that

$$\sigma_t^2 = \frac{\alpha[1 - (1 - \alpha)^{2t}]}{2 - \alpha} \cdot \frac{\sigma^2}{n}$$

c. An exponentially weighted moving-average control
chart plots the W_t’s and uses control limits μ₀ ± 3σ_t
(or $\bar{\bar{x}}$ in place of μ₀). Construct such a chart for the
data of Example 16.9, using μ₀ = 40.


Appendix Tables


Table A.1 Cumulative Binomial Probabilities

$$B(x; n, p) = \sum_{y=0}^{x} b(y; n, p)$$

a. n = 5
                                                 p
  x   0.01  0.05  0.10  0.20  0.25  0.30  0.40  0.50  0.60  0.70  0.75  0.80  0.90  0.95  0.99
  0   .951  .774  .590  .328  .237  .168  .078  .031  .010  .002  .001  .000  .000  .000  .000
  1   .999  .977  .919  .737  .633  .528  .337  .188  .087  .031  .016  .007  .000  .000  .000
  2  1.000  .999  .991  .942  .896  .837  .683  .500  .317  .163  .104  .058  .009  .001  .000
  3  1.000 1.000 1.000  .993  .984  .969  .913  .812  .663  .472  .367  .263  .081  .023  .001
  4  1.000 1.000 1.000 1.000  .999  .998  .990  .969  .922  .832  .763  .672  .410  .226  .049

b. n = 10
                                                 p
  x   0.01  0.05  0.10  0.20  0.25  0.30  0.40  0.50  0.60  0.70  0.75  0.80  0.90  0.95  0.99
  0   .904  .599  .349  .107  .056  .028  .006  .001  .000  .000  .000  .000  .000  .000  .000
  1   .996  .914  .736  .376  .244  .149  .046  .011  .002  .000  .000  .000  .000  .000  .000
  2  1.000  .988  .930  .678  .526  .383  .167  .055  .012  .002  .000  .000  .000  .000  .000
  3  1.000  .999  .987  .879  .776  .650  .382  .172  .055  .011  .004  .001  .000  .000  .000
  4  1.000 1.000  .998  .967  .922  .850  .633  .377  .166  .047  .020  .006  .000  .000  .000
  5  1.000 1.000 1.000  .994  .980  .953  .834  .623  .367  .150  .078  .033  .002  .000  .000
  6  1.000 1.000 1.000  .999  .996  .989  .945  .828  .618  .350  .224  .121  .013  .001  .000
  7  1.000 1.000 1.000 1.000 1.000  .998  .988  .945  .833  .617  .474  .322  .070  .012  .000
  8  1.000 1.000 1.000 1.000 1.000 1.000  .998  .989  .954  .851  .756  .624  .264  .086  .004
  9  1.000 1.000 1.000 1.000 1.000 1.000 1.000  .999  .994  .972  .944  .893  .651  .401  .096

c. n = 15
                                                 p
  x   0.01  0.05  0.10  0.20  0.25  0.30  0.40  0.50  0.60  0.70  0.75  0.80  0.90  0.95  0.99
  0   .860  .463  .206  .035  .013  .005  .000  .000  .000  .000  .000  .000  .000  .000  .000
  1   .990  .829  .549  .167  .080  .035  .005  .000  .000  .000  .000  .000  .000  .000  .000
  2  1.000  .964  .816  .398  .236  .127  .027  .004  .000  .000  .000  .000  .000  .000  .000
  3  1.000  .995  .944  .648  .461  .297  .091  .018  .002  .000  .000  .000  .000  .000  .000
  4  1.000  .999  .987  .836  .686  .515  .217  .059  .009  .001  .000  .000  .000  .000  .000
  5  1.000 1.000  .998  .939  .852  .722  .403  .151  .034  .004  .001  .000  .000  .000  .000
  6  1.000 1.000 1.000  .982  .943  .869  .610  .304  .095  .015  .004  .001  .000  .000  .000
  7  1.000 1.000 1.000  .996  .983  .950  .787  .500  .213  .050  .017  .004  .000  .000  .000
  8  1.000 1.000 1.000  .999  .996  .985  .905  .696  .390  .131  .057  .018  .000  .000  .000
  9  1.000 1.000 1.000 1.000  .999  .996  .966  .849  .597  .278  .148  .061  .002  .000  .000
 10  1.000 1.000 1.000 1.000 1.000  .999  .991  .941  .783  .485  .314  .164  .013  .001  .000
 11  1.000 1.000 1.000 1.000 1.000 1.000  .998  .982  .909  .703  .539  .352  .056  .005  .000
 12  1.000 1.000 1.000 1.000 1.000 1.000 1.000  .996  .973  .873  .764  .602  .184  .036  .000
 13  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .995  .965  .920  .833  .451  .171  .010
 14  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .995  .987  .965  .794  .537  .140


Table A.1 Cumulative Binomial Probabilities (cont.)

$$B(x; n, p) = \sum_{y=0}^{x} b(y; n, p)$$

d. n = 20
                                                 p
  x   0.01  0.05  0.10  0.20  0.25  0.30  0.40  0.50  0.60  0.70  0.75  0.80  0.90  0.95  0.99
  0   .818  .358  .122  .012  .003  .001  .000  .000  .000  .000  .000  .000  .000  .000  .000
  1   .983  .736  .392  .069  .024  .008  .001  .000  .000  .000  .000  .000  .000  .000  .000
  2   .999  .925  .677  .206  .091  .035  .004  .000  .000  .000  .000  .000  .000  .000  .000
  3  1.000  .984  .867  .411  .225  .107  .016  .001  .000  .000  .000  .000  .000  .000  .000
  4  1.000  .997  .957  .630  .415  .238  .051  .006  .000  .000  .000  .000  .000  .000  .000
  5  1.000 1.000  .989  .804  .617  .416  .126  .021  .002  .000  .000  .000  .000  .000  .000
  6  1.000 1.000  .998  .913  .786  .608  .250  .058  .006  .000  .000  .000  .000  .000  .000
  7  1.000 1.000 1.000  .968  .898  .772  .416  .132  .021  .001  .000  .000  .000  .000  .000
  8  1.000 1.000 1.000  .990  .959  .887  .596  .252  .057  .005  .001  .000  .000  .000  .000
  9  1.000 1.000 1.000  .997  .986  .952  .755  .412  .128  .017  .004  .001  .000  .000  .000
 10  1.000 1.000 1.000  .999  .996  .983  .872  .588  .245  .048  .014  .003  .000  .000  .000
 11  1.000 1.000 1.000 1.000  .999  .995  .943  .748  .404  .113  .041  .010  .000  .000  .000
 12  1.000 1.000 1.000 1.000 1.000  .999  .979  .868  .584  .228  .102  .032  .000  .000  .000
 13  1.000 1.000 1.000 1.000 1.000 1.000  .994  .942  .750  .392  .214  .087  .002  .000  .000
 14  1.000 1.000 1.000 1.000 1.000 1.000  .998  .979  .874  .584  .383  .196  .011  .000  .000
 15  1.000 1.000 1.000 1.000 1.000 1.000 1.000  .994  .949  .762  .585  .370  .043  .003  .000
 16  1.000 1.000 1.000 1.000 1.000 1.000 1.000  .999  .984  .893  .775  .589  .133  .016  .000
 17  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .996  .965  .909  .794  .323  .075  .001
 18  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .999  .992  .976  .931  .608  .264  .017
 19  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .999  .997  .988  .878  .642  .182


Table A.1 Cumulative Binomial Probabilities (cont.)

$$B(x; n, p) = \sum_{y=0}^{x} b(y; n, p)$$

e. n = 25
                                                 p
  x   0.01  0.05  0.10  0.20  0.25  0.30  0.40  0.50  0.60  0.70  0.75  0.80  0.90  0.95  0.99
  0   .778  .277  .072  .004  .001  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000
  1   .974  .642  .271  .027  .007  .002  .000  .000  .000  .000  .000  .000  .000  .000  .000
  2   .998  .873  .537  .098  .032  .009  .000  .000  .000  .000  .000  .000  .000  .000  .000
  3  1.000  .966  .764  .234  .096  .033  .002  .000  .000  .000  .000  .000  .000  .000  .000
  4  1.000  .993  .902  .421  .214  .090  .009  .000  .000  .000  .000  .000  .000  .000  .000
  5  1.000  .999  .967  .617  .378  .193  .029  .002  .000  .000  .000  .000  .000  .000  .000
  6  1.000 1.000  .991  .780  .561  .341  .074  .007  .000  .000  .000  .000  .000  .000  .000
  7  1.000 1.000  .998  .891  .727  .512  .154  .022  .001  .000  .000  .000  .000  .000  .000
  8  1.000 1.000 1.000  .953  .851  .677  .274  .054  .004  .000  .000  .000  .000  .000  .000
  9  1.000 1.000 1.000  .983  .929  .811  .425  .115  .013  .000  .000  .000  .000  .000  .000
 10  1.000 1.000 1.000  .994  .970  .902  .586  .212  .034  .002  .000  .000  .000  .000  .000
 11  1.000 1.000 1.000  .998  .989  .956  .732  .345  .078  .006  .001  .000  .000  .000  .000
 12  1.000 1.000 1.000 1.000  .997  .983  .846  .500  .154  .017  .003  .000  .000  .000  .000
 13  1.000 1.000 1.000 1.000  .999  .994  .922  .655  .268  .044  .011  .002  .000  .000  .000
 14  1.000 1.000 1.000 1.000 1.000  .998  .966  .788  .414  .098  .030  .006  .000  .000  .000
 15  1.000 1.000 1.000 1.000 1.000 1.000  .987  .885  .575  .189  .071  .017  .000  .000  .000
 16  1.000 1.000 1.000 1.000 1.000 1.000  .996  .946  .726  .323  .149  .047  .000  .000  .000
 17  1.000 1.000 1.000 1.000 1.000 1.000  .999  .978  .846  .488  .273  .109  .002  .000  .000
 18  1.000 1.000 1.000 1.000 1.000 1.000 1.000  .993  .926  .659  .439  .220  .009  .000  .000
 19  1.000 1.000 1.000 1.000 1.000 1.000 1.000  .998  .971  .807  .622  .383  .033  .001  .000
 20  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .991  .910  .786  .579  .098  .007  .000
 21  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .998  .967  .904  .766  .236  .034  .000
 22  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .991  .968  .902  .463  .127  .002
 23  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .998  .993  .973  .729  .358  .026
 24  1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000  .999  .996  .928  .723  .222

Table A.2 Cumulative Poisson Probabilities

$$F(x; \mu) = \sum_{y=0}^{x} \frac{e^{-\mu}\mu^y}{y!}$$

                                   μ
  x     .1     .2     .3     .4     .5     .6     .7     .8     .9    1.0
  0   .905   .819   .741   .670   .607   .549   .497   .449   .407   .368
  1   .995   .982   .963   .938   .910   .878   .844   .809   .772   .736
  2  1.000   .999   .996   .992   .986   .977   .966   .953   .937   .920
  3         1.000  1.000   .999   .998   .997   .994   .991   .987   .981
  4                       1.000  1.000  1.000   .999   .999   .998   .996
  5                                            1.000  1.000  1.000   .999
  6                                                                 1.000


Table A.2 Cumulative Poisson Probabilities (cont.)

$$F(x; \mu) = \sum_{y=0}^{x} \frac{e^{-\mu}\mu^y}{y!}$$

                                      μ
  x    2.0    3.0    4.0    5.0    6.0    7.0    8.0    9.0   10.0   15.0   20.0
  0   .135   .050   .018   .007   .002   .001   .000   .000   .000   .000   .000
  1   .406   .199   .092   .040   .017   .007   .003   .001   .000   .000   .000
  2   .677   .423   .238   .125   .062   .030   .014   .006   .003   .000   .000
  3   .857   .647   .433   .265   .151   .082   .042   .021   .010   .000   .000
  4   .947   .815   .629   .440   .285   .173   .100   .055   .029   .001   .000
  5   .983   .916   .785   .616   .446   .301   .191   .116   .067   .003   .000
  6   .995   .966   .889   .762   .606   .450   .313   .207   .130   .008   .000
  7   .999   .988   .949   .867   .744   .599   .453   .324   .220   .018   .001
  8  1.000   .996   .979   .932   .847   .729   .593   .456   .333   .037   .002
  9          .999   .992   .968   .916   .830   .717   .587   .458   .070   .005
 10         1.000   .997   .986   .957   .901   .816   .706   .583   .118   .011
 11                 .999   .995   .980   .947   .888   .803   .697   .185   .021
 12                1.000   .998   .991   .973   .936   .876   .792   .268   .039
 13                        .999   .996   .987   .966   .926   .864   .363   .066
 14                       1.000   .999   .994   .983   .959   .917   .466   .105
 15                               .999   .998   .992   .978   .951   .568   .157
 16                              1.000   .999   .996   .989   .973   .664   .221
 17                                     1.000   .998   .995   .986   .749   .297
 18                                             .999   .998   .993   .819   .381
 19                                            1.000   .999   .997   .875   .470
 20                                                   1.000   .998   .917   .559
 21                                                           .999   .947   .644
 22                                                          1.000   .967   .721
 23                                                                  .981   .787
 24                                                                  .989   .843
 25                                                                  .994   .888
 26                                                                  .997   .922
 27                                                                  .998   .948
 28                                                                  .999   .966
 29                                                                 1.000   .978
 30                                                                         .987
 31                                                                         .992
 32                                                                         .995
 33                                                                         .997
 34                                                                         .999
 35                                                                         .999


Table A.3 Standard Normal Curve Areas @O(z) = P(Z<2z) 


Standard normal density curve 


x 


Shaded area = P(z) 


0 z 
z      .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
-3.4 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0002
-3.3 .0005 .0005 .0005 .0004 .0004 .0004 .0004 .0004 .0004 .0003
-3.2 .0007 .0007 .0006 .0006 .0006 .0006 .0006 .0005 .0005 .0005
-3.1 .0010 .0009 .0009 .0009 .0008 .0008 .0008 .0008 .0007 .0007
-3.0 .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
-2.9 .0019 .0018 .0017 .0017 .0016 .0016 .0015 .0015 .0014 .0014
-2.8 .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0021 .0020 .0019
-2.7 .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
-2.6 .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
-2.5 .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
-2.4 .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
-2.3 .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
-2.2 .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
-2.1 .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
-2.0 .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
-1.9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
-1.8 .0359 .0352 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
-1.7 .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
-1.6 .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
-1.5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
-1.4 .0808 .0793 .0778 .0764 .0749 .0735 .0722 .0708 .0694 .0681
-1.3 .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
-1.2 .1151 .1131 .1112 .1093 .1075 .1056 .1038 .1020 .1003 .0985
-1.1 .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
-1.0 .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
-0.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
-0.8 .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
-0.7 .2420 .2389 .2358 .2327 .2296 .2266 .2236 .2206 .2177 .2148
-0.6 .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451
-0.5 .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
-0.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
-0.3 .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3482
-0.2 .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859
-0.1 .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
-0.0 .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641
Table A.3 Standard Normal Curve Areas (cont.)  Φ(z) = P(Z ≤ z)

z      .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
0.0  .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1  .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2  .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
0.3  .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
0.4  .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
0.5  .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
0.6  .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
0.7  .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
0.8  .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
0.9  .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0  .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1  .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2  .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3  .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4  .9192 .9207 .9222 .9236 .9251 .9265 .9278 .9292 .9306 .9319
1.5  .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6  .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7  .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8  .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9  .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0  .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1  .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2  .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3  .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4  .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5  .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6  .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7  .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8  .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9  .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0  .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1  .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2  .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
3.3  .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997
3.4  .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998
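
Entries of Table A.3 can be reproduced numerically from the identity Φ(z) = ½[1 + erf(z/√2)]. The following is a minimal sketch using only the Python standard library; the function name `std_normal_cdf` is an illustrative choice, not notation from the text.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Return Phi(z) = P(Z <= z) for a standard normal Z via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Compare a few values against the tabled four-decimal entries.
    for z in (-1.96, 0.0, 1.25, 2.33):
        print(f"Phi({z:+.2f}) = {std_normal_cdf(z):.4f}")
```

Because the table is rounded to four decimal places, computed values should agree with it to within about .0001.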
Table A.4 The Incomplete Gamma Function  $F(x;\alpha) = \int_0^x \frac{1}{\Gamma(\alpha)}\, y^{\alpha-1} e^{-y}\, dy$

x \ α      1      2      3      4      5      6      7      8      9     10
1       .632   .264   .080   .019   .004   .001   .000   .000   .000   .000
2       .865   .594   .323   .143   .053   .017   .005   .001   .000   .000
3       .950   .801   .577   .353   .185   .084   .034   .012   .004   .001
4       .982   .908   .762   .567   .371   .215   .111   .051   .021   .008
5       .993   .960   .875   .735   .560   .384   .238   .133   .068   .032
6       .998   .983   .938   .849   .715   .554   .394   .256   .153   .084
7       .999   .993   .970   .918   .827   .699   .550   .401   .271   .170
8      1.000   .997   .986   .958   .900   .809   .687   .547   .407   .283
9              .999   .994   .979   .945   .884   .793   .676   .544   .413
10            1.000   .997   .990   .971   .933   .870   .780   .667   .542
11                    .999   .995   .985   .962   .921   .857   .768   .659
12                   1.000   .998   .992   .980   .954   .911   .845   .758
13                           .999   .996   .989   .974   .946   .900   .834
14                          1.000   .998   .994   .986   .968   .938   .891
15                                  .999   .997   .992   .982   .963   .930
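
For the positive integer values of α tabulated in Table A.4, the incomplete gamma function has a closed form in terms of a Poisson sum: F(x; α) = 1 − Σ_{k=0}^{α−1} e^{−x} x^k / k!. The sketch below relies on that identity and therefore applies only to integer α; the function name `incomplete_gamma_int` is hypothetical.

```python
import math


def incomplete_gamma_int(x: float, alpha: int) -> float:
    """F(x; alpha) for a positive integer alpha, using the Poisson-sum identity."""
    if alpha < 1:
        raise ValueError("alpha must be a positive integer")
    if x <= 0:
        return 0.0
    # P(Poisson(x) <= alpha - 1)
    poisson_cdf = sum(math.exp(-x) * x**k / math.factorial(k) for k in range(alpha))
    return 1.0 - poisson_cdf


if __name__ == "__main__":
    # Reproduce the x = 3 row of the table to three decimals.
    print([round(incomplete_gamma_int(3, a), 3) for a in range(1, 11)])
```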
Table A.5 Critical Values for t Distributions

[Figure: t_ν density curve; the shaded area to the right of t_{α,ν} is α]

ν \ α  .10 .05 .025 .01 .005 .001 .0005

1 3.078 6.314 12.706 31.821 63.657 318.31 636.62 
2 1.886 2.920 4.303 6.965 9.925 22.326 31.598 
3 1.638 2.353 3.182 4.541 5.841 10.213 12.924 
4 1.533 2.132 2.776 3.747 4.604 7.173 8.610 
5 1.476 2.015 2.571 3.365 4.032 5.893 6.869 
6 1.440 1.943 2.447 3.143 3.707 5.208 5.959 
7 1.415 1.895 2.365 2.998 3.499 4.785 5.408 
8 1.397 1.860 2.306 2.896 3.355 4.501 5.041 
9 1.383 1.833 2.262 2.821 3.250 4.297 4.781 
10 1.372 1.812 2.228 2.764 3.169 4.144 4.587 
11 1.363 1.796 2.201 2.718 3.106 4.025 4.437 
12 1.356 1.782 2.179 2.681 3.055 3.930 4.318 
13 1.350 1.771 2.160 2.650 3.012 3.852 4.221 
14 1.345 1.761 2.145 2.624 2.977 3.787 4.140 
15 1.341 1.753 2.131 2.602 2.947 3.733 4.073 
16 1.337 1.746 2.120 2.583 2.921 3.686 4.015 
17 1.333 1.740 2.110 2.567 2.898 3.646 3.965 
18 1.330 1.734 2.101 2.552 2.878 3.610 3.922
19 1.328 1.729 2.093 2.539 2.861 3.579 3.883 
20 1.325 1.725 2.086 2.528 2.845 3.552 3.850 
21 1.323 1.721 2.080 2.518 2.831 3.527 3.819 
22 1.321 1.717 2.074 2.508 2.819 3.505 3.792 
23 1.319 1.714 2.069 2.500 2.807 3.485 3.767 
24 1.318 1.711 2.064 2.492 2.797 3.467 3.745
25 1.316 1.708 2.060 2.485 2.787 3.450 3.725 
26 1.315 1.706 2.056 2.479 2.779 3.435 3.707 
27 1.314 1.703 2.052 2.473 2.771 3.421 3.690 
28 1.313 1.701 2.048 2.467 2.763 3.408 3.674 
29 1.311 1.699 2.045 2.462 2.756 3.396 3.659 
30 1.310 1.697 2.042 2.457 2.750 3.385 3.646 
32 1.309 1.694 2.037 2.449 2.738 3.365 3.622 
34 1.307 1.691 2.032 2.441 2.728 3.348 3.601 
36 1.306 1.688 2.028 2.434 2.719 3.333 3.582 
38 1.304 1.686 2.024 2.429 2.712 3.319 3.566 
40 1.303 1.684 2.021 2.423 2.704 3.307 3.551
50 1.299 1.676 2.009 2.403 2.678 3.262 3.496 
60 1.296 1.671 2.000 2.390 2.660 3.232 3.460 
120 1.289 1.658 1.980 2.358 2.617 3.160 3.373 


∞ 1.282 1.645 1.960 2.326 2.576 3.090 3.291
Table A.7 Critical Values for Chi-Squared Distributions

[Figure: χ²_ν density curve; the shaded area to the right of χ²_{α,ν} is α]

ν \ α  .995 .99 .975 .95 .90 .10 .05 .025 .01 .005

1 0.000 0.000 0.001 0.004 0.016 2.706 3.843 5.025 6.637 7.882 

2 0.010 0.020 0.051 0.103 0.211 4.605 5.992 7.378 9.210 10.597 

3 0.072 0.115 0.216 0.352 0.584 6.251 7.815 9.348 11.344 12.837 

4 0.207 0.297 0.484 0.711 1.064 7.779 9.488 11.143 13.277 14.860 

5 0.412 0.554 0.831 1.145 1.610 9.236 11.070 12.832 15.085 16.748 

6 0.676 0.872 1.237 1.635 2.204 10.645 12.592 14.440 16.812 18.548 

7 0.989 1.239 1.690 2.167 2.833 12.017 14.067 16.012 18.474 20.276 

8 1.344 1.646 2.180 2.733 3.490 13.362 15.507 17.534 20.090 21.954 

9 1.735 2.088 2.700 3.325 4.168 14.684 16.919 19.022 21.665 23.587 
10 2.156 2.558 3.247 3.940 4.865 15.987 18.307 20.483 23.209 25.188 
11 2.603 3.053 3.816 4.575 5.578 17.275 19.675 21.920 24.724 26.755 
12 3.074 3.571 4.404 5.226 6.304 18.549 21.026 23.337 26.217 28.300
13 3.565 4.107 5.009 5.892 7.041 19.812 22.362 24.735 27.687 29.817 
14 4.075 4.660 5.629 6.571 7.790 21.064 23.685 26.119 29.141 31.319 
15 4.600 5.229 6.262 7.261 8.547 22.307 24.996 27.488 30.577 32.799 
16 5.142 5.812 6.908 7.962 9.312 23.542 26.296 28.845 32.000 34.267 
17 5.697 6.407 7.564 8.682 10.085 24.769 27.587 30.190 33.408 35.716 
18 6.265 7.015 8.231 9.390 10.865 25.989 28.869 31.526 34.805 37.156 
19 6.843 7.632 8.906 10.117 11.651 27.203 30.143 32.852 36.190 38.580 
20 7.434 8.260 9.591 10.851 12.443 28.412 31.410 34.170 37.566 39.997 
21 8.033 8.897 10.283 11.591 13.240 29.615 32.670 35.478 38.930 41.399 
22 8.643 9.542 10.982 12.338 14.042 30.813 33.924 36.781 40.289 42.796 
23 9.260 10.195 11.688 13.090 14.848 32.007 35.172 38.075 41.637 44.179
24 9.886 10.856 12.401 13.848 15.659 33.196 36.415 39.364 42.980 45.558 


25 10.519 11.523 13.120 14.611 16.473 34.381 37.652 40.646 44.313 46.925


26 11.160 12.198 13.844 15.379 17.292 35.563 38.885 41.923 45.642 48.290 
27 11.807 12.878 14.573 16.151 18.114 36.741 40.113 43.194 46.962 49.642 
28 12.461 13.565 15.308 16.928 18.939 37.916 41.337 44.461 48.278 50.993 
29 13.120 14.256 16.147 17.708 19.768 39.087 42.557 45.772 49.586 52.333 
30 13.787 14.954 16.791 18.493 20.599 40.256 43.773 46.979 50.892 53.672 


31 14.457 15.655 17.538 19.280 21.433 41.422 44.985 48.231 52.190 55.000 
32 15.134 16.362 18.291 20.072 22.271 42.585 46.194 49.480 53.486 56.328 
33 15.814 17.073 19.046 20.866 23.110 43.745 47.400 50.724 54.774 57.646 
34 16.501 17.789 19.806 21.664 23.952 44.903 48.602 51.966 56.061 58.964 
35 17.191 18.508 20.569 22.465 24.796 46.059 49.802 53.203 57.340 60.272 


36 17.887 19.233 21.336 23.269 25.643 47.212 50.998 54.437 58.619 61.581 
37 18.584 19.960 22.105 24.075 26.492 48.363 52.192 55.667 59.891 62.880 
38 19.289 20.691 22.878 24.884 27.343 49.513 53.384 56.896 61.162 64.181 
39 19.994 21.425 23.654 25.695 28.196 50.660 54.572 58.119 62.426 65.473 
40 20.706 22.164 24.433 26.509 29.050 51.805 55.758 59.342 63.691 66.766 


For ν > 40, $\chi^2_{\alpha,\nu} \approx \nu\left(1 - \frac{2}{9\nu} + z_\alpha\sqrt{\frac{2}{9\nu}}\right)^3$
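
For degrees of freedom beyond the end of Table A.7, the large-ν approximation χ²_{α,ν} ≈ ν(1 − 2/(9ν) + z_α√(2/(9ν)))³ (commonly called the Wilson-Hilferty approximation) can be evaluated directly. The sketch below assumes the caller supplies z_α, for example from the ∞ row of Table A.5; the function name `chi2_critical_approx` is illustrative.

```python
import math


def chi2_critical_approx(z_alpha: float, nu: float) -> float:
    """Approximate the chi-squared critical value with upper-tail area alpha.

    z_alpha is the standard normal critical value for the same alpha
    (e.g. 1.645 for alpha = .05); nu is the number of degrees of freedom.
    """
    c = 2.0 / (9.0 * nu)
    return nu * (1.0 - c + z_alpha * math.sqrt(c)) ** 3


if __name__ == "__main__":
    # Check against the last tabled row (nu = 40), then extrapolate to nu = 60.
    print(round(chi2_critical_approx(1.645, 40), 3))
    print(round(chi2_critical_approx(1.645, 60), 3))
```

At ν = 40 and α = .05 the approximation differs from the tabled value 55.758 only in the third decimal place, which suggests it is adequate for ν > 40.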
Table A.8 t Curve Tail Areas

[Figure: t curve; the shaded area is the area to the right of t]

t \ ν   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18

0.0  .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500
0.1  .468 .465 .463 .463 .462 .462 .462 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461
0.2  .437 .430 .427 .426 .425 .424 .424 .423 .423 .423 .423 .422 .422 .422 .422 .422 .422 .422
0.3  .407 .396 .392 .390 .388 .387 .386 .386 .386 .385 .385 .385 .384 .384 .384 .384 .384 .384
0.4  .379 .364 .358 .355 .353 .352 .351 .350 .349 .349 .348 .348 .348 .347 .347 .347 .347 .347
0.5  .352 .333 .326 .322 .319 .317 .316 .315 .315 .314 .313 .313 .313 .312 .312 .312 .312 .312

0.6  .328 .305 .295 .290 .287 .285 .284 .283 .282 .281 .280 .280 .279 .279 .279 .278 .278 .278
0.7  .306 .278 .267 .261 .258 .255 .253 .252 .251 .250 .249 .249 .248 .247 .247 .247 .247 .246
0.8  .285 .254 .241 .234 .230 .227 .225 .223 .222 .221 .220 .220 .219 .218 .218 .218 .217 .217
0.9  .267 .232 .217 .210 .205 .201 .199 .197 .196 .195 .194 .193 .192 .191 .191 .191 .190 .190
1.0  .250 .211 .196 .187 .182 .178 .175 .173 .172 .170 .169 .169 .168 .167 .167 .166 .166 .165

1.1  .235 .193 .176 .167 .162 .157 .154 .152 .150 .149 .147 .146 .146 .144 .144 .144 .143 .143
1.2  .221 .177 .158 .148 .142 .138 .135 .132 .130 .129 .128 .127 .126 .124 .124 .124 .123 .123
1.3  .209 .162 .142 .132 .125 .121 .117 .115 .113 .111 .110 .109 .108 .107 .107 .106 .105 .105
1.4  .197 .148 .128 .117 .110 .106 .102 .100 .098 .096 .095 .093 .092 .091 .091 .090 .090 .089
1.5  .187 .136 .115 .104 .097 .092 .089 .086 .084 .082 .081 .080 .079 .077 .077 .077 .076 .075

1.6  .178 .125 .104 .092 .085 .080 .077 .074 .072 .070 .069 .068 .067 .065 .065 .065 .064 .064
1.7  .169 .116 .094 .082 .075 .070 .065 .064 .062 .060 .059 .057 .056 .055 .055 .054 .054 .053
1.8  .161 .107 .085 .073 .066 .061 .057 .055 .053 .051 .050 .049 .048 .046 .046 .045 .045 .044
1.9  .154 .099 .077 .065 .058 .053 .050 .047 .045 .043 .042 .041 .040 .038 .038 .038 .037 .037
2.0  .148 .092 .070 .058 .051 .046 .043 .040 .038 .037 .035 .034 .033 .032 .032 .031 .031 .030

2.1  .141 .085 .063 .052 .045 .040 .037 .034 .033 .031 .030 .029 .028 .027 .027 .026 .025 .025
2.2  .136 .079 .058 .046 .040 .035 .032 .029 .028 .026 .025 .024 .023 .022 .022 .021 .021 .021
2.3  .131 .074 .052 .041 .035 .031 .027 .025 .023 .022 .021 .020 .019 .018 .018 .018 .017 .017
2.4  .126 .069 .048 .037 .031 .027 .024 .022 .020 .019 .018 .017 .016 .015 .015 .014 .014 .014
2.5  .121 .065 .044 .033 .027 .023 .020 .018 .017 .016 .015 .014 .013 .012 .012 .012 .011 .011

2.6  .117 .061 .040 .030 .024 .020 .018 .016 .014 .013 .012 .012 .011 .010 .010 .010 .009 .009
2.7  .113 .057 .037 .027 .021 .018 .015 .014 .012 .011 .010 .010 .009 .008 .008 .008 .008 .007
2.8  .109 .054 .034 .024 .019 .016 .013 .012 .010 .009 .009 .008 .008 .007 .007 .006 .006 .006
2.9  .106 .051 .031 .022 .017 .014 .011 .010 .009 .008 .007 .007 .006 .005 .005 .005 .005 .005
3.0  .102 .048 .029 .020 .015 .012 .010 .009 .007 .007 .006 .006 .005 .004 .004 .004 .004 .004

3.1  .099 .045 .027 .018 .013 .011 .009 .007 .006 .006 .005 .005 .004 .004 .004 .003 .003 .003
3.2  .096 .043 .025 .016 .012 .009 .008 .006 .005 .005 .004 .004 .003 .003 .003 .003 .003 .002
3.3  .094 .040 .023 .015 .011 .008 .007 .005 .005 .004 .004 .003 .003 .002 .002 .002 .002 .002
3.4  .091 .038 .021 .014 .010 .007 .006 .005 .004 .003 .003 .003 .002 .002 .002 .002 .002 .002
3.5  .089 .036 .020 .012 .009 .006 .005 .004 .003 .003 .002 .002 .002 .002 .002 .001 .001 .001

3.6  .086 .035 .018 .011 .008 .006 .004 .004 .003 .002 .002 .002 .002 .001 .001 .001 .001 .001
3.7  .084 .033 .017 .010 .007 .005 .004 .003 .002 .002 .002 .002 .001 .001 .001 .001 .001 .001
3.8  .082 .031 .016 .010 .006 .004 .003 .003 .002 .002 .001 .001 .001 .001 .001 .001 .001 .001
3.9  .080 .030 .015 .009 .006 .004 .003 .002 .002 .001 .001 .001 .001 .001 .001 .001 .001 .001
4.0  .078 .029 .014 .008 .005 .004 .003 .002 .002 .001 .001 .001 .001 .001 .001 .001 .000 .000

(continued)
Table A.8 t Curve Tail Areas (cont.)

t \ ν  19   20   21   22   23   24   25   26   27   28   29   30   35   40   60  120    ∞ (= z)

0.0  .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500 .500
0.1  .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .461 .460 .460 .460 .460 .460
0.2  .422 .422 .422 .422 .422 .422 .422 .422 .421 .421 .421 .421 .421 .421 .421 .421 .421
0.3  .384 .384 .384 .383 .383 .383 .383 .383 .383 .383 .383 .383 .383 .383 .383 .382 .382
0.4  .347 .347 .347 .347 .346 .346 .346 .346 .346 .346 .346 .346 .346 .346 .345 .345 .345
0.5  .311 .311 .311 .311 .311 .311 .311 .311 .311 .310 .310 .310 .310 .310 .309 .309 .309

0.6  .278 .278 .278 .277 .277 .277 .277 .277 .277 .277 .277 .277 .276 .276 .275 .275 .274
0.7  .246 .246 .246 .246 .245 .245 .245 .245 .245 .245 .245 .245 .244 .244 .243 .243 .242
0.8  .217 .217 .216 .216 .216 .216 .216 .215 .215 .215 .215 .215 .215 .214 .213 .213 .212
0.9  .190 .189 .189 .189 .189 .189 .188 .188 .188 .188 .188 .188 .187 .187 .186 .185 .184
1.0  .165 .165 .164 .164 .164 .164 .163 .163 .163 .163 .163 .163 .162 .162 .161 .160 .159

1.1  .143 .142 .142 .142 .141 .141 .141 .141 .141 .140 .140 .140 .139 .139 .138 .137 .136
1.2  .122 .122 .122 .121 .121 .121 .121 .120 .120 .120 .120 .120 .119 .119 .117 .116 .115
1.3  .105 .104 .104 .104 .103 .103 .103 .103 .102 .102 .102 .102 .101 .101 .099 .098 .097
1.4  .089 .089 .088 .088 .087 .087 .087 .087 .086 .086 .086 .086 .085 .085 .083 .082 .081
1.5  .075 .075 .074 .074 .074 .073 .073 .073 .073 .072 .072 .072 .071 .071 .069 .068 .067

1.6  .063 .063 .062 .062 .062 .061 .061 .061 .061 .060 .060 .060 .059 .059 .057 .056 .055
1.7  .053 .052 .052 .052 .051 .051 .051 .051 .050 .050 .050 .050 .049 .048 .047 .046 .045
1.8  .044 .043 .043 .043 .042 .042 .042 .042 .042 .041 .041 .041 .040 .040 .038 .037 .036
1.9  .036 .036 .036 .035 .035 .035 .035 .034 .034 .034 .034 .034 .033 .032 .031 .030 .029
2.0  .030 .030 .029 .029 .029 .028 .028 .028 .028 .028 .027 .027 .027 .026 .025 .024 .023

2.1  .025 .024 .024 .024 .023 .023 .023 .023 .023 .022 .022 .022 .022 .021 .020 .019 .018
2.2  .020 .020 .020 .019 .019 .019 .019 .018 .018 .018 .018 .018 .017 .017 .016 .015 .014
2.3  .016 .016 .016 .016 .015 .015 .015 .015 .015 .015 .014 .014 .014 .013 .012 .012 .011
2.4  .013 .013 .013 .013 .012 .012 .012 .012 .012 .012 .012 .011 .011 .011 .010 .009 .008
2.5  .011 .011 .010 .010 .010 .010 .010 .010 .009 .009 .009 .009 .009 .008 .008 .007 .006

2.6  .009 .009 .008 .008 .008 .008 .008 .008 .007 .007 .007 .007 .007 .007 .006 .005 .005
2.7  .007 .007 .007 .007 .006 .006 .006 .006 .006 .006 .006 .006 .005 .005 .004 .004 .003
2.8  .006 .006 .005 .005 .005 .005 .005 .005 .005 .005 .005 .004 .004 .004 .003 .003 .003
2.9  .005 .004 .004 .004 .004 .004 .004 .004 .004 .004 .004 .003 .003 .003 .003 .002 .002
3.0  .004 .004 .003 .003 .003 .003 .003 .003 .003 .003 .003 .003 .002 .002 .002 .002 .001

3.1  .003 .003 .003 .003 .003 .002 .002 .002 .002 .002 .002 .002 .002 .002 .001 .001 .001
3.2  .002 .002 .002 .002 .002 .002 .002 .002 .002 .002 .002 .002 .001 .001 .001 .001 .001
3.3  .002 .002 .002 .002 .002 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000
3.4  .002 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000
3.5  .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000 .000

3.6  .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .001 .000 .000 .000 .000 .000
3.7  .001 .001 .001 .001 .001 .001 .001 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000
3.8  .001 .001 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
3.9  .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
4.0  .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000
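
The tail areas in Table A.8 are integrals of the t density, f(x; ν) = Γ((ν+1)/2) / (√(νπ) Γ(ν/2)) · (1 + x²/ν)^(−(ν+1)/2), so they can be checked by numerical integration. The sketch below uses composite Simpson's rule on [0, t] and subtracts the result from .5. The function names and the choice of 2000 subintervals are illustrative assumptions, not part of the text.

```python
import math


def t_density(x: float, nu: float) -> float:
    """Density of the t distribution with nu degrees of freedom at x."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_upper_tail(t: float, nu: float, steps: int = 2000) -> float:
    """Area under the t curve to the right of t >= 0 (composite Simpson's rule)."""
    if t < 0:
        raise ValueError("t must be nonnegative; use symmetry for negative t")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    if t == 0:
        return 0.5
    h = t / steps
    total = t_density(0.0, nu) + t_density(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, nu)
    # By symmetry the area right of 0 is .5; subtract the area on [0, t].
    return 0.5 - total * h / 3


if __name__ == "__main__":
    print(round(t_upper_tail(2.0, 10), 3))  # compare with the t = 2.0, nu = 10 entry
```

For a two-tailed test, the P-value would be twice the value returned.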
Table A.9 Critical Values for F Distributions

ν₁ = numerator df; ν₂ = denominator df

  ν₂  α    1 2 3 4 5 6 7 8 9
   1  .100 39.86 49.50 53.59 55.83 57.24 58.20 58.91 59.44 59.86
      .050 161.45 199.50 215.71 224.58 230.16 233.99 236.77 238.88 240.54
      .010 4052.20 4999.50 5403.40 5624.60 5763.60 5859.00 5928.40 5981.10 6022.50
      .001 405,284 500,000 540,379 562,500 576,405 585,937 592,873 598,144 602,284
   2  .100 8.53 9.00 9.16 9.24 9.29 9.33 9.35 9.37 9.38
      .050 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38
      .010 98.50 99.00 99.17 99.25 99.30 99.33 99.36 99.37 99.39
      .001 998.50 999.00 999.17 999.25 999.30 999.33 999.36 999.37 999.39
   3  .100 5.54 5.46 5.39 5.34 5.31 5.28 5.27 5.25 5.24
      .050 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81
      .010 34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.35
      .001 167.03 148.50 141.11 137.10 134.58 132.85 131.58 130.62 129.86
   4  .100 4.54 4.32 4.19 4.11 4.05 4.01 3.98 3.95 3.94
      .050 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00
      .010 21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66
      .001 74.14 61.25 56.18 53.44 51.71 50.53 49.66 49.00 48.47
   5  .100 4.06 3.78 3.62 3.52 3.45 3.40 3.37 3.34 3.32
      .050 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77
      .010 16.26 13.27 12.06 11.39 10.97 10.67 10.46 10.29 10.16
      .001 47.18 37.12 33.20 31.09 29.75 28.83 28.16 27.65 27.24
   6  .100 3.78 3.46 3.29 3.18 3.11 3.05 3.01 2.98 2.96
      .050 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10
      .010 13.75 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98
      .001 35.51 27.00 23.70 21.92 20.80 20.03 19.46 19.03 18.69
   7  .100 3.59 3.26 3.07 2.96 2.88 2.83 2.78 2.75 2.72
      .050 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68
      .010 12.25 9.55 8.45 7.85 7.46 7.19 6.99 6.84 6.72
      .001 29.25 21.69 18.77 17.20 16.21 15.52 15.02 14.63 14.33
   8  .100 3.46 3.11 2.92 2.81 2.73 2.67 2.62 2.59 2.56
      .050 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39
      .010 11.26 8.65 7.59 7.01 6.63 6.37 6.18 6.03 5.91
      .001 25.41 18.49 15.83 14.39 13.48 12.86 12.40 12.05 11.77
   9  .100 3.36 3.01 2.81 2.69 2.61 2.55 2.51 2.47 2.44
      .050 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18
      .010 10.56 8.02 6.99 6.42 6.06 5.80 5.61 5.47 5.35
      .001 22.86 16.39 13.90 12.56 11.71 11.13 10.70 10.37 10.11
  10  .100 3.29 2.92 2.73 2.61 2.52 2.46 2.41 2.38 2.35
      .050 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02
      .010 10.04 7.56 6.55 5.99 5.64 5.39 5.20 5.06 4.94
      .001 21.04 14.91 12.55 11.28 10.48 9.93 9.52 9.20 8.96
  11  .100 3.23 2.86 2.66 2.54 2.45 2.39 2.34 2.30 2.27
      .050 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90
      .010 9.65 7.21 6.22 5.67 5.32 5.07 4.89 4.74 4.63
      .001 19.69 13.81 11.56 10.35 9.58 9.05 8.66 8.35 8.12
  12  .100 3.18 2.81 2.61 2.48 2.39 2.33 2.28 2.24 2.21
      .050 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80
      .010 9.33 6.93 5.95 5.41 5.06 4.82 4.64 4.50 4.39
      .001 18.64 12.97 10.80 9.63 8.89 8.38 8.00 7.71 7.48
Table A.9 Critical Values for F Distributions (cont.)

ν₁ = numerator df


10 12 15 20 25 30 40 50 60 120 1000 


60.19 60.71 61.22 61.74 62.05 62.26 62.53 62.69 62.79 63.06 63.30 
241.88 243.91 245.95 248.01 249.26 250.10 251.14 251.77 252.20 253.25 254.19 
6055.80 6106.30 6157.30 6208.70 6239.80 6260.60 6286.80 6302.50 6313.00 6339.40 6362.70 
605,621 610,668 615,764 620,908 624,017 626,099 628,712 630,285 631,337 633,972 636,301 


9.39 9.41 9.42 9.44 9.45 9.46 9.47 9.47 9.47 9.48 9.49 
19.40 19.41 19.43 19.45 19.46 19.46 19.47 19.48 19.48 19.49 19.49 
99.40 99.42 99.43 99.45 99.46 99.47 99.47 99.48 99.48 99.49 99.50 

999.40 999.42 999.43 999.45 999.46 999.47 999.47 999.48 999.48 999.49 999.50 

5.23 5.22 5.20 5.18 5.17 5.17 5.16 5.15 5.15 5.14 5.13

8.79 8.74 8.70 8.66 8.63 8.62 8.59 8.58 8.57 8.55 8.53 
27.23 27.05 26.87 26.69 26.58 26.50 26.41 26.35 26.32 26.22 26.14 

129.25 128.32 127.37 126.42 125.84 125.45 124.96 124.66 124.47 123.97 123.53 

3.92 3.90 3.87 3.84 3.83 3.82 3.80 3.80 3.79 3.78 3.76 

5.96 5.91 5.86 5.80 5.77 5.75 5.72 5.70 5.69 5.66 5.63
14.55 14.37 14.20 14.02 13.91 13.84 13.75 13.69 13.65 13.56 13.47 
48.05 47.41 46.76 46.10 45.70 45.43 45.09 44.88 44.75 44.40 44.09 

3.30 3.27 3.24 3.21 3.19 3.17 3.16 3.15 3.14 3.12 3.11 

4.74 4.68 4.62 4.56 4.52 4.50 4.46 4.44 4.43 4.40 4.37 
10.05 9.89 9.72 9.55 9.45 9.38 9.29 9.24 9.20 9.11 9.03 
26.92 26.42 25.91 25.39 25.08 24.87 24.60 24.44 24.33 24.06 23.82 

2.94 2.90 2.87 2.84 2.81 2.80 2.78 2.77 2.76 2.74 2.72

4.06 4.00 3.94 3.87 3.83 3.81 3.77 3.75 3.74 3.70 3.67

7.87 7.72 7.56 7.40 7.30 7.23 7.14 7.09 7.06 6.97 6.89
18.41 17.99 17.56 17.12 16.85 16.67 16.44 16.31 16.21 15.98 15.77 

2.70 2.67 2.63 2.59 2.57 2.56 2.54 2.52 2.51 2.49 2.47 

3.64 3.57 3.51 3.44 3.40 3.38 3.34 3.32 3.30 3.27 3.23

6.62 6.47 6.31 6.16 6.06 5.99 5.91 5.86 5.82 5.74 5.66 
14.08 13.71 13.32 12.93 12.69 12.53 12.33 12.20 12.12 11.91 11.72 

2.54 2.50 2.46 2.42 2.40 2.38 2.36 2.35 2.34 2.32 2.30 

3.35 3.28 3.22 3.15 3.11 3.08 3.04 3.02 3.01 2.97 2.93 

5.81 5.67 5.52 5.36 5.26 5.20 5.12 5.07 5.03 4.95 4.87 
11.54 11.19 10.84 10.48 10.26 10.11 9.92 9.80 9.73 9.53 9.36 

2.42 2.38 2.34 2.30 2.27 2.25 2.23 2.22 2.21 2.18 2.16 

3.14 3.07 3.01 2.94 2.89 2.86 2.83 2.80 2.79 2.75 2.71

5.26 5.11 4.96 4.81 4.71 4.65 4.57 4.52 4.48 4.40 4.32

9.89 9.57 9.24 8.90 8.69 8.55 8.37 8.26 8.19 8.00 7.84 

2.32 2.28 2.24 2.20 2.17 2.16 2.13 2.12 2.11 2.08 2.06

2.98 2.91 2.85 2.77 2.73 2.70 2.66 2.64 2.62 2.58 2.54 

4.85 4.71 4.56 4.41 4.31 4.25 4.17 4.12 4.08 4.00 3.92

8.75 8.45 8.13 7.80 7.60 7.47 7.30 7.19 7.12 6.94 6.78

2.25 2.21 2.17 2.12 2.10 2.08 2.05 2.04 2.03 2.00 1.98 

2.85 2.79 2.72 2.65 2.60 2.57 2.53 2.51 2.49 2.45 2.41

4.54 4.40 4.25 4.10 4.01 3.94 3.86 3.81 3.78 3.69 3.61 

7.92 7.63 7.32 7.01 6.81 6.68 6.52 6.42 6.35 6.18 6.02

2.19 2.15 2.10 2.06 2.03 2.01 1.99 1.97 1.96 1.93 1.91

2.75 2.69 2.62 2.54 2.50 2.47 2.43 2.40 2.38 2.34 2.30 

4.30 4.16 4.01 3.86 3.76 3.70 3.62 3.57 3.54 3.45 3.37

7.29 7.00 6.71 6.40 6.22 6.09 5.93 5.83 5.76 5.59 5.44 
Table A.9 Critical Values for F Distributions (cont.)

ν₁ = numerator df; ν₂ = denominator df

  ν₂  α    1 2 3 4 5 6 7 8 9
  13  .100 3.14 2.76 2.56 2.43 2.35 2.28 2.23 2.20 2.16
      .050 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71
      .010 9.07 6.70 5.74 5.21 4.86 4.62 4.44 4.30 4.19
      .001 17.82 12.31 10.21 9.07 8.35 7.86 7.49 7.21 6.98
  14  .100 3.10 2.73 2.52 2.39 2.31 2.24 2.19 2.15 2.12
      .050 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65
      .010 8.86 6.51 5.56 5.04 4.69 4.46 4.28 4.14 4.03
      .001 17.14 11.78 9.73 8.62 7.92 7.44 7.08 6.80 6.58
  15  .100 3.07 2.70 2.49 2.36 2.27 2.21 2.16 2.12 2.09
      .050 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59
      .010 8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89
      .001 16.59 11.34 9.34 8.25 7.57 7.09 6.74 6.47 6.26
  16  .100 3.05 2.67 2.46 2.33 2.24 2.18 2.13 2.09 2.06
      .050 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54
      .010 8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78
      .001 16.12 10.97 9.01 7.94 7.27 6.80 6.46 6.19 5.98
  17  .100 3.03 2.64 2.44 2.31 2.22 2.15 2.10 2.06 2.03
      .050 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49
      .010 8.40 6.11 5.19 4.67 4.34 4.10 3.93 3.79 3.68
      .001 15.72 10.66 8.73 7.68 7.02 6.56 6.22 5.96 5.75
  18  .100 3.01 2.62 2.42 2.29 2.20 2.13 2.08 2.04 2.00
      .050 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46
      .010 8.29 6.01 5.09 4.58 4.25 4.01 3.84 3.71 3.60
      .001 15.38 10.39 8.49 7.46 6.81 6.35 6.02 5.76 5.56
  19  .100 2.99 2.61 2.40 2.27 2.18 2.11 2.06 2.02 1.98
      .050 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42
      .010 8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52
      .001 15.08 10.16 8.28 7.27 6.62 6.18 5.85 5.59 5.39
  20  .100 2.97 2.59 2.38 2.25 2.16 2.09 2.04 2.00 1.96
      .050 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39
      .010 8.10 5.85 4.94 4.43 4.10 3.87 3.70 3.56 3.46
      .001 14.82 9.95 8.10 7.10 6.46 6.02 5.69 5.44 5.24
  21  .100 2.96 2.57 2.36 2.23 2.14 2.08 2.02 1.98 1.95
      .050 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37
      .010 8.02 5.78 4.87 4.37 4.04 3.81 3.64 3.51 3.40
      .001 14.59 9.77 7.94 6.95 6.32 5.88 5.56 5.31 5.11
  22  .100 2.95 2.56 2.35 2.22 2.13 2.06 2.01 1.97 1.93
      .050 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34
      .010 7.95 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35
      .001 14.38 9.61 7.80 6.81 6.19 5.76 5.44 5.19 4.99
  23  .100 2.94 2.55 2.34 2.21 2.11 2.05 1.99 1.95 1.92
      .050 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32
      .010 7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30
      .001 14.20 9.47 7.67 6.70 6.08 5.65 5.33 5.09 4.89
  24  .100 2.93 2.54 2.33 2.19 2.10 2.04 1.98 1.94 1.91
      .050 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30
      .010 7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.26
      .001 14.03 9.34 7.55 6.59 5.98 5.55 5.23 4.99 4.80
Table A.9 Critical Values for F Distributions (cont.)

ν₁ = numerator df


10 12 15 20 25 30 40 50 60 120 1000 
2.14 2.10 2.05 2.01 1.98 1.96 1.93 1.92 1.90 1.88 1.85 
2.67 2.60 2.53 2.46 2.41 2.38 2.34 2.31 2.30 2.25 2.21 
4.10 3.96 3.82 3.66 3.57 3.51 3.43 3.38 3.34 3.25 3.18
6.80 6.52 6.23 5.93 5.75 5.63 5.47 5.37 5.30 5.14 4.99
2.10 2.05 2.01 1.96 1.93 1.91 1.89 1.87 1.86 1.83 1.80 
2.60 2.53 2.46 2.39 2.34 2.31 2.27 2.24 2.22 2.18 2.14 
3.94 3.80 3.66 3.51 3.41 3.35 3.27 3.22 3.18 3.09 3.02 
6.40 6.13 5.85 5.56 5.38 5.25 5.10 5.00 4.94 4.77 4.62
2.06 2.02 1.97 1.92 1.89 1.87 1.85 1.83 1.82 1.79 1.76 
2.54 2.48 2.40 2.33 2.28 2.25 2.20 2.18 2.16 2.11 2.07 
3.80 3.67 3.52 3.37 3.28 3.21 3.13 3.08 3.05 2.96 2.88
6.08 5.81 5.54 5.25 5.07 4.95 4.80 4.70 4.64 4.47 4.33
2.03 1.99 1.94 1.89 1.86 1.84 1.81 1.79 1.78 1.75 1.72
2.49 2.42 2.35 2.28 2.23 2.19 2.15 2.12 2.11 2.06 2.02
3.69 3.55 3.41 3.26 3.16 3.10 3.02 2.97 2.93 2.84 2.76
5.81 5.55 5.27 4.99 4.82 4.70 4.54 4.45 4.39 4.23 4.08
2.00 1.96 1.91 1.86 1.83 1.81 1.78 1.76 1.75 1.72 1.69 
2.45 2.38 2.31 2.23 2.18 2.15 2.10 2.08 2.06 2.01 1.97 
3.59 3.46 3.31 3.16 3.07 3.00 2.92 2.87 2.83 2.75 2.66 
5.58 5.32 5.05 4.78 4.60 4.48 4.33 4.24 4.18 4.02 3.87
1.98 1.93 1.89 1.84 1.80 1.78 1.75 1.74 1.72 1.69 1.66
2.41 2.34 2.27 2.19 2.14 2.11 2.06 2.04 2.02 1.97 1.92
3.51 3.37 3.23 3.08 2.98 2.92 2.84 2.78 2.75 2.66 2.58
5.39 5.13 4.87 4.59 4.42 4.30 4.15 4.06 4.00 3.84 3.69 
1.96 1.91 1.86 1.81 1.78 1.76 1.73 1.71 1.70 1.67 1.64 
2.38 2.31 2.23 2.16 2.11 2.07 2.03 2.00 1.98 1.93 1.88
3.43 3.30 3.15 3.00 2.91 2.84 2.76 2.71 2.67 2.58 2.50 
5.22 4.97 4.70 4.43 4.26 4.14 3.99 3.90 3.84 3.68 3.53 
1.94 1.89 1.84 1.79 1.76 1.74 1.71 1.69 1.68 1.64 1.61
2.35 2.28 2.20 2.12 2.07 2.04 1.99 1.97 1.95 1.90 1.85
3.37 3.23 3.09 2.94 2.84 2.78 2.69 2.64 2.61 2.52 2.43
5.08 4.82 4.56 4.29 4.12 4.00 3.86 3.77 3.70 3.54 3.40 
1.92 1.87 1.83 1.78 1.74 1.72 1.69 1.67 1.66 1.62 1.59 
2.32 2.25 2.18 2.10 2.05 2.01 1.96 1.94 1.92 1.87 1.82 
3.31 3.17 3.03 2.88 2.79 2.72 2.64 2.58 2.55 2.46 2.37
4.95 4.70 4.44 4.17 4.00 3.88 3.74 3.64 3.58 3.42 3.28 
1.90 1.86 1.81 1.76 1.73 1.70 1.67 1.65 1.64 1.60 1.57 
2.30 2.23 2.15 2.07 2.02 1.98 1.94 1.91 1.89 1.84 1.79
3.26 3.12 2.98 2.83 2.73 2.67 2.58 2.53 2.50 2.40 2.32 
4.83 4.58 4.33 4.06 3.89 3.78 3.63 3.54 3.48 3.32 el 
1.89 1.84 1.80 1.74 1.71 1.69 1.66 1.64 1.62 1.59 1.55 
2.27 2.20 2.13 2.05 2.00 1.96 1.91 1.88 1.86 1.81 1.76 
3.21 3.07 2.93 2.78 2.69 2.62 2.54 2.48 2.45 2.35 2.27 
4.73 4.48 4.23 3.96 3.79 3.68 3.53 3.44 3.38 3.22 3.08 
1.88 1.83 1.78 1.73 1.70 1.67 1.64 1.62 1.61 1.57 1.54 
2.25 2.18 2.11 2.03 1.97 1.94 1.89 1.86 1.84 1.79 1.74 
3.17 3.03 2.89 2.74 2.64 2.58 2.49 2.44 2.40 2.31 2.22
4.64 4.39 4.14 3.87 3.71 3.59 3.45 3.36 3.29 3.14 2.99 
Table A.9 Critical Values for F Distributions (cont.)

ν₁ = numerator df; ν₂ = denominator df

  ν₂  α    1 2 3 4 5 6 7 8 9
  25  .100 2.92 2.53 2.32 2.18 2.09 2.02 1.97 1.93 1.89
      .050 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28
      .010 7.77 5.57 4.68 4.18 3.85 3.63 3.46 3.32 3.22
      .001 13.88 9.22 7.45 6.49 5.89 5.46 5.15 4.91 4.71
  26  .100 2.91 2.52 2.31 2.17 2.08 2.01 1.96 1.92 1.88
      .050 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27
      .010 7.72 5.53 4.64 4.14 3.82 3.59 3.42 3.29 3.18
      .001 13.74 9.12 7.36 6.41 5.80 5.38 5.07 4.83 4.64
  27  .100 2.90 2.51 2.30 2.17 2.07 2.00 1.95 1.91 1.87
      .050 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25
      .010 7.68 5.49 4.60 4.11 3.78 3.56 3.39 3.26 3.15
      .001 13.61 9.02 7.27 6.33 5.73 5.31 5.00 4.76 4.57
  28  .100 2.89 2.50 2.29 2.16 2.06 2.00 1.94 1.90 1.87
      .050 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24
      .010 7.64 5.45 4.57 4.07 3.75 3.53 3.36 3.23 3.12
      .001 13.50 8.93 7.19 6.25 5.66 5.24 4.93 4.69 4.50
  29  .100 2.89 2.50 2.28 2.15 2.06 1.99 1.93 1.89 1.86
      .050 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22
      .010 7.60 5.42 4.54 4.04 3.73 3.50 3.33 3.20 3.09
      .001 13.39 8.85 7.12 6.19 5.59 5.18 4.87 4.64 4.45
  30  .100 2.88 2.49 2.28 2.14 2.05 1.98 1.93 1.88 1.85
      .050 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21
      .010 7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.07
      .001 13.29 8.77 7.05 6.12 5.53 5.12 4.82 4.58 4.39
  40  .100 2.84 2.44 2.23 2.09 2.00 1.93 1.87 1.83 1.79
      .050 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12
      .010 7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.89
      .001 12.61 8.25 6.59 5.70 5.13 4.73 4.44 4.21 4.02
  50  .100 2.81 2.41 2.20 2.06 1.97 1.90 1.84 1.80 1.76
      .050 4.03 3.18 2.79 2.56 2.40 2.29 2.20 2.13 2.07
      .010 7.17 5.06 4.20 3.72 3.41 3.19 3.02 2.89 2.78
      .001 12.22 7.96 6.34 5.46 4.90 4.51 4.22 4.00 3.82
  60  .100 2.79 2.39 2.18 2.04 1.95 1.87 1.82 1.77 1.74
      .050 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04
      .010 7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72
      .001 11.97 7.77 6.17 5.31 4.76 4.37 4.09 3.86 3.69
 100  .100 2.76 2.36 2.14 2.00 1.91 1.83 1.78 1.73 1.69
      .050 3.94 3.09 2.70 2.46 2.31 2.19 2.10 2.03 1.97
      .010 6.90 4.82 3.98 3.51 3.21 2.99 2.82 2.69 2.59
      .001 11.50 7.41 5.86 5.02 4.48 4.11 3.83 3.61 3.44
 200  .100 2.73 2.33 2.11 1.97 1.88 1.80 1.75 1.70 1.66
      .050 3.89 3.04 2.65 2.42 2.26 2.14 2.06 1.98 1.93
      .010 6.76 4.71 3.88 3.41 3.11 2.89 2.73 2.60 2.50
      .001 11.15 7.15 5.63 4.81 4.29 3.92 3.65 3.43 3.26
1000  .100 2.71 2.31 2.09 1.95 1.85 1.78 1.72 1.68 1.64
      .050 3.85 3.00 2.61 2.38 2.22 2.11 2.02 1.95 1.89
      .010 6.66 4.63 3.80 3.34 3.04 2.82 2.66 2.53 2.43
      .001 10.89 6.96 5.46 4.65 4.14 3.78 3.51 3.30 3.13
Table A.9 Critical Values for F Distributions (cont.)

ν₁ = numerator df


10 12 15 20 25 30 40 50 60 120 1000 
1.87 1.82 1.77 1.72 1.68 1.66 1.63 1.61 1.59 1.56 1.52 
2.24 2.16 2.09 2.01 1.96 1.92 1.87 1.84 1.82 1.77 1.72 
3.13 2.99 2.85 2.70 2.60 2.54 2.45 2.40 2.36 2.27 2.18
4.56 4.31 4.06 3.79 3.63 3.52 3.37 3.28 3.22 3.06 2.91 
1.86 1.81 1.76 1.71 1.67 1.65 1.61 1.59 1.58 1.54 1.51 
2.22 2.15 2.07 1.99 1.94 1.90 1.85 1.82 1.80 1.75 1.70 
3.09 2.96 2.81 2.66 2.57 2.50 2.42 2.36 2.33 2.23 2.14 
4.48 4.24 3.99 3.72 3.56 3.44 3.30 3.21 3.15 2.99 2.84
1.85 1.80 1.75 1.70 1.66 1.64 1.60 1.58 1.57 1.53 1.50
2.20 2.13 2.06 1.97 1.92 1.88 1.84 1.81 1.79 1.73 1.68 
3.06 2.93 2.78 2.63 2.54 2.47 2.38 2.33 2.29 2.20 2.11 
4.41 4.17 3.92 3.66 3.49 3.38 3.23 3.14 3.08 2.92 2.78 
1.84 1.79 1.74 1.69 1.65 1.63 1.59 1.57 1.56 1.52 1.48 
2.19 2.12 2.04 1.96 1.91 1.87 1.82 1.79 1.77 1.71 1.66
3.03 2.90 2.75 2.60 2.51 2.44 2.35 2.30 2.26 2.17 2.08 
4.35 4.11 3.86 3.60 3.43 3.32 3.18 3.09 3.02 2.86 2.72 
1.83 1.78 1/3 1.68 1.64 1.62 1.58 1.56 1.55 1.51 1.47 
2.18 2.10 2.03 1.94 1.89 1.85 1.81 177 1.75 1.70 1.65 
3.00 2.87 2.73 257 2.48 2.41 2.33 2.27 2.23 2.14 2.05 
4.29 4.05 3.80 3.54 3.38 3.27 3.12 3.03 2.97 2.81 2.66 
1.82 LT 1.72 1.67 1.63 1.61 1:57 1.55 1.54 1.50 1.46 
2.16 2.09 2.01 1.93 1.88 1.84 1.79 1.76 1.74 1.68 1.63 
2.98 2.84 2.70 2.55 2.45 2.39 2.30 2.25 2.21 2.11 2.02
4.24 4.00 3.75 3.49 3.33 3.22 3.07 2.98 2.92 2.76 2.61 
1.76 1.71 1.66 1.61 1.57 1.54 1.51 1.48 1.47 1.42 1.38
2.08 2.00 1.92 1.84 1.78 1.74 1.69 1.66 1.64 1.58 1.52 
2.80 2.66 2.52 2.37 2.27 2.20 2.11 2.06 2.02 1.92 1.82
3.87 3.64 3.40 3.14 2.98 2.87 2.73 2.64 2.57 2.41 2.25
1.73 1.68 1.63 1.57 1.53 1.50 1.46 1.44 1.42 1.38 1.33 
2.03 1.95 1.87 1.78 1.73 1.69 1.63 1.60 1.58 1.51 1.45 
2.70 2.56 2.42 2.27 2.17 2.10 2.01 1.95 1.91 1.80 1.70
3.67 3.44 3.20 2.95 2.79 2.68 2.53 2.44 2.38 2.21 2.05
1.71 1.66 1.60 1.54 1.50 1.48 1.44 1.41 1.40 1.35 1.30 
1.99 1.92 1.84 1.75 1.69 1.65 1.59 1.56 1.53 1.47 1.40 
2.63 2.50 2.35 2.20 2.10 2.03 1.94 1.88 1.84 1.73 1.62
3.54 3.32 3.08 2.83 2.67 2.55 2.41 2.32 2.25 2.08 1.92
1.66 1.61 1.56 1.49 1.45 1.42 1.38 1.35 1.34 1.28 1.22 
1.93 1.85 1.77 1.68 1.62 1.57 1.52 1.48 1.45 1.38 1.30
2.50 2.37 2.22 2.07 1.97 1.89 1.80 1.74 1.69 1.57 1.45 
3.30 3.07 2.84 2.59 2.43 2.32 2.17 2.08 2.01 1.83 1.64 
1.63 1.58 1.52 1.46 1.41 1.38 1.34 1.31 1.29 1.23 1.16 
1.88 1.80 1.72 1.62 1.56 1.52 1.46 1.41 1.39 1.30 1.21 
2.41 2.27 2.13 1.97 1.87 1.79 1.69 1.63 1.58 1.45 1.30
3.12 2.90 2.67 2.42 2.26 2.15 2.00 1.90 1.83 1.64 1.43
1.61 1.55 1.49 1.43 1.38 1.35 1.30 1.27 1.25 1.18 1.08 
1.84 1.76 1.68 1.58 1.52 1.47 1.41 1.36 1.33 1.24 1.11 
2.34 2.20 2.06 1.90 1.79 1.72 1.61 1.54 1.50 1.35 1.16
2.99 2.77 2.54 2.30 2.14 2.02 1.87 1.77 1.69 1.49 1.22


Table A.10 Critical Values for Studentized Range Distributions 


ν α m = 2 3 4 5 6 7 8 9 10 11 12
5 .05 3.64 4.60 5.22 5.67 6.03 6.33 6.58 6.80 6.99 7.17 7.32
.01 5.70 6.98 7.80 8.42 8.91 9.32 9.67 9.97 10.24 10.48 10.70
6 .05 3.46 4.34 4.90 5.30 5.63 5.90 6.12 6.32 6.49 6.65 6.79
.01 5.24 6.33 7.03 7.56 7.97 8.32 8.61 8.87 9.10 9.30 9.48
7 .05 3.34 4.16 4.68 5.06 5.36 5.61 5.82 6.00 6.16 6.30 6.43
.01 4.95 5.92 6.54 7.01 7.37 7.68 7.94 8.17 8.37 8.55 8.71
8 .05 3.26 4.04 4.53 4.89 5.17 5.40 5.60 5.77 5.92 6.05 6.18
.01 4.75 5.64 6.20 6.62 6.96 7.24 7.47 7.68 7.86 8.03 8.18
9 .05 3.20 3.95 4.41 4.76 5.02 5.24 5.43 5.59 5.74 5.87 5.98
.01 4.60 5.43 5.96 6.35 6.66 6.91 7.13 7.33 7.49 7.65 7.78
10 .05 3.15 3.88 4.33 4.65 4.91 5.12 5.30 5.46 5.60 5.72 5.83
.01 4.48 5.27 5.77 6.14 6.43 6.67 6.87 7.05 7.21 7.36 7.49
11 .05 3.11 3.82 4.26 4.57 4.82 5.03 5.20 5.35 5.49 5.61 5.71
.01 4.39 5.15 5.62 5.97 6.25 6.48 6.67 6.84 6.99 7.13 7.25
12 .05 3.08 3.77 4.20 4.51 4.75 4.95 5.12 5.27 5.39 5.51 5.61
.01 4.32 5.05 5.50 5.84 6.10 6.32 6.51 6.67 6.81 6.94 7.06
13 .05 3.06 3.73 4.15 4.45 4.69 4.88 5.05 5.19 5.32 5.43 5.53
.01 4.26 4.96 5.40 5.73 5.98 6.19 6.37 6.53 6.67 6.79 6.90
14 .05 3.03 3.70 4.11 4.41 4.64 4.83 4.99 5.13 5.25 5.36 5.46
.01 4.21 4.89 5.32 5.63 5.88 6.08 6.26 6.41 6.54 6.66 6.77
15 .05 3.01 3.67 4.08 4.37 4.59 4.78 4.94 5.08 5.20 5.31 5.40
.01 4.17 4.84 5.25 5.56 5.80 5.99 6.16 6.31 6.44 6.55 6.66
16 .05 3.00 3.65 4.05 4.33 4.56 4.74 4.90 5.03 5.15 5.26 5.35
.01 4.13 4.79 5.19 5.49 5.72 5.92 6.08 6.22 6.35 6.46 6.56
17 .05 2.98 3.63 4.02 4.30 4.52 4.70 4.86 4.99 5.11 5.21 5.31
.01 4.10 4.74 5.14 5.43 5.66 5.85 6.01 6.15 6.27 6.38 6.48
18 .05 2.97 3.61 4.00 4.28 4.49 4.67 4.82 4.96 5.07 5.17 5.27
.01 4.07 4.70 5.09 5.38 5.60 5.79 5.94 6.08 6.20 6.31 6.41
19 .05 2.96 3.59 3.98 4.25 4.47 4.65 4.79 4.92 5.04 5.14 5.23
.01 4.05 4.67 5.05 5.33 5.55 5.73 5.89 6.02 6.14 6.25 6.34
20 .05 2.95 3.58 3.96 4.23 4.45 4.62 4.77 4.90 5.01 5.11 5.20
.01 4.02 4.64 5.02 5.29 5.51 5.69 5.84 5.97 6.09 6.19 6.28
24 .05 2.92 3.53 3.90 4.17 4.37 4.54 4.68 4.81 4.92 5.01 5.10
.01 3.96 4.55 4.91 5.17 5.37 5.54 5.69 5.81 5.92 6.02 6.11
30 .05 2.89 3.49 3.85 4.10 4.30 4.46 4.60 4.72 4.82 4.92 5.00
.01 3.89 4.45 4.80 5.05 5.24 5.40 5.54 5.65 5.76 5.85 5.93
40 .05 2.86 3.44 3.79 4.04 4.23 4.39 4.52 4.63 4.73 4.82 4.90
.01 3.82 4.37 4.70 4.93 5.11 5.26 5.39 5.50 5.60 5.69 5.76
60 .05 2.83 3.40 3.74 3.98 4.16 4.31 4.44 4.55 4.65 4.73 4.81
.01 3.76 4.28 4.59 4.82 4.99 5.13 5.25 5.36 5.45 5.53 5.60
120 .05 2.80 3.36 3.68 3.92 4.10 4.24 4.36 4.47 4.56 4.64 4.71
.01 3.70 4.20 4.50 4.71 4.87 5.01 5.12 5.21 5.30 5.37 5.44
∞ .05 2.77 3.31 3.63 3.86 4.03 4.17 4.29 4.39 4.47 4.55 4.62
.01 3.64 4.12 4.40 4.60 4.76 4.88 4.99 5.08 5.16 5.23 5.29


Table A.11 Chi-Squared Curve Tail Areas



Upper-Tail Area ν = 1 ν = 2 ν = 3 ν = 4 ν = 5
> .100 < 2.70 < 4.60 < 6.25 < 7.77 < 9.23
.100 2.70 4.60 6.25 7.77 9.23
.095 2.78 4.70 6.36 7.90 9.37
.090 2.87 4.81 6.49 8.04 9.52
.085 2.96 4.93 6.62 8.18 9.67
.080 3.06 5.05 6.75 8.33 9.83
.075 3.17 5.18 6.90 8.49 10.00
.070 3.28 5.31 7.06 8.66 10.19
.065 3.40 5.46 7.22 8.84 10.38
.060 3.53 5.62 7.40 9.04 10.59
.055 3.68 5.80 7.60 9.25 10.82
.050 3.84 5.99 7.81 9.48 11.07
.045 4.01 6.20 8.04 9.74 11.34
.040 4.21 6.43 8.31 10.02 11.64
.035 4.44 6.70 8.60 10.34 11.98
.030 4.70 7.01 8.94 10.71 12.37
.025 5.02 7.37 9.34 11.14 12.83
.020 5.41 7.82 9.83 11.66 13.38
.015 5.91 8.39 10.46 12.33 14.09
.010 6.63 9.21 11.34 13.27 15.08
.005 7.87 10.59 12.83 14.86 16.74
.001 10.82 13.81 16.26 18.46 20.51
< .001 > 10.82 > 13.81 > 16.26 > 18.46 > 20.51
Upper-Tail Area ν = 6 ν = 7 ν = 8 ν = 9 ν = 10
> .100 < 10.64 < 12.01 < 13.36 < 14.68 < 15.98
.100 10.64 12.01 13.36 14.68 15.98
.095 10.79 12.17 13.52 14.85 16.16
.090 10.94 12.33 13.69 15.03 16.35
.085 11.11 12.50 13.87 15.22 16.54
.080 11.28 12.69 14.06 15.42 16.75
.075 11.46 12.88 14.26 15.63 16.97
.070 11.65 13.08 14.48 15.85 17.20
.065 11.86 13.30 14.71 16.09 17.44
.060 12.08 13.53 14.95 16.34 17.71
.055 12.33 13.79 15.22 16.62 17.99
.050 12.59 14.06 15.50 16.91 18.30
.045 12.87 14.36 15.82 17.24 18.64
.040 13.19 14.70 16.17 17.60 19.02
.035 13.55 15.07 16.56 18.01 19.44
.030 13.96 15.50 17.01 18.47 19.92
.025 14.44 16.01 17.53 19.02 20.48
.020 15.03 16.62 18.16 19.67 21.16
.015 15.77 17.39 18.97 20.51 22.02
.010 16.81 18.47 20.09 21.66 23.20
.005 18.54 20.27 21.95 23.58 25.18
.001 22.45 24.32 26.12 27.87 29.58
< .001 > 22.45 > 24.32 > 26.12 > 27.87 > 29.58


Table A.11 Chi-Squared Curve Tail Areas (cont.)


Upper-Tail Area ν = 11 ν = 12 ν = 13 ν = 14 ν = 15
> .100 < 17.27 < 18.54 < 19.81 < 21.06 < 22.30
.100 17.27 18.54 19.81 21.06 22.30
.095 17.45 18.74 20.00 21.26 22.51
.090 17.65 18.93 20.21 21.47 22.73
.085 17.85 19.14 20.42 21.69 22.95
.080 18.06 19.36 20.65 21.93 23.19
.075 18.29 19.60 20.89 22.17 23.45
.070 18.53 19.84 21.15 22.44 23.72
.065 18.78 20.11 21.42 22.71 24.00
.060 19.06 20.39 21.71 23.01 24.31
.055 19.35 20.69 22.02 23.33 24.63
.050 19.67 21.02 22.36 23.68 24.99
.045 20.02 21.38 22.73 24.06 25.38
.040 20.41 21.78 23.14 24.48 25.81
.035 20.84 22.23 23.60 24.95 26.29
.030 21.34 22.74 24.12 25.49 26.84
.025 21.92 23.33 24.73 26.11 27.48
.020 22.61 24.05 25.47 26.87 28.25
.015 23.50 24.96 26.40 27.82 29.23
.010 24.72 26.21 27.68 29.14 30.57
.005 26.75 28.29 29.81 31.31 32.80
.001 31.26 32.90 34.52 36.12 37.69
< .001 > 31.26 > 32.90 > 34.52 > 36.12 > 37.69
Upper-Tail Area ν = 16 ν = 17 ν = 18 ν = 19 ν = 20
> .100 < 23.54 < 24.77 < 25.98 < 27.20 < 28.41
.100 23.54 24.77 25.98 27.20 28.41
.095 23.75 24.98 26.21 27.43 28.64
.090 23.97 25.21 26.44 27.66 28.88
.085 24.21 25.45 26.68 27.91 29.14
.080 24.45 25.70 26.94 28.18 29.40
.075 24.71 25.97 27.21 28.45 29.69
.070 24.99 26.25 27.50 28.75 29.99
.065 25.28 26.55 27.81 29.06 30.30
.060 25.59 26.87 28.13 29.39 30.64
.055 25.93 27.21 28.48 29.75 31.01
.050 26.29 27.58 28.86 30.14 31.41
.045 26.69 27.99 29.28 30.56 31.84
.040 27.13 28.44 29.74 31.03 32.32
.035 27.62 28.94 30.25 31.56 32.85
.030 28.19 29.52 30.84 32.15 33.46
.025 28.84 30.19 31.52 32.85 34.16
.020 29.63 30.99 32.34 33.68 35.01
.015 30.62 32.01 33.38 34.74 36.09
.010 32.00 33.40 34.80 36.19 37.56
.005 34.26 35.71 37.15 38.58 39.99
.001 39.25 40.78 42.31 43.81 45.31
< .001 > 39.25 > 40.78 > 42.31 > 43.81 > 45.31
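
The tail areas in Table A.11 can be cross-checked numerically. What follows is a minimal sketch, not part of the original appendix: it relies on the standard closed-form series for the chi-squared upper-tail probability when the degrees of freedom are a positive integer, and it uses only the Python standard library. The function name `chi2_upper_tail` is a hypothetical helper introduced here for illustration.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail area P(chi-squared with df degrees of freedom > x).

    Assumes df is a positive integer and x >= 0 (illustrative helper,
    not taken from the text).
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series whose
    # r-th term is sqrt(2x/pi) * exp(-x/2) * x^(r-1) / (1*3*...*(2r-1)).
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    correction = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        correction += term
        term *= x / (2 * r + 1)
    return tail + correction
```

Evaluating the function at the entries in the .050 row of the table (for example 3.84 with ν = 1 or 18.30 with ν = 10) should return values close to .05, up to the two-decimal rounding of the tabled entries.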




Table A.12 Approximate Critical Values for the 
Ryan-Joiner Test of Normality 


α
n .10 .05 .01
4 .8951 .8734 .8318
5 .9033 .8804 .8319
6 .9114 .8893 .8409
7 .9186 .8978 .8517
8 .9248 .9054 .8622
9 .9301 .9121 .8718
10 .9347 .9179 .8804
11 .9387 .9230 .8880
12 .9422 .9275 .8947
13 .9454 .9315 .9008
14 .9481 .9351 .9061
15 .9506 .9383 .9109
16 .9529 .9411 .9153
17 .9549 .9437 .9192
18 .9567 .9461 .9228
19 .9584 .9483 .9260
20 .9600 .9503 .9290
25 .9662 .9582 .9407
30 .9707 .9639 .9490
40 .9767 .9715 .9597
50 .9807 .9764 .9664
60 .9835 .9799 .9709
75 .9865 .9835 .9756


Source: Minitab Reference Manual. 


Table A.13 Critical Values for the Wilcoxon Signed-Rank Test
$P_0(S_+ \ge c_1) = P(S_+ \ge c_1 \text{ when } H_0 \text{ is true})$
n $c_1$ $P_0(S_+ \ge c_1)$    n $c_1$ $P_0(S_+ \ge c_1)$
3 6 .125    78 .011
4 9 .125    79 .009
10 .062    81 .005
5 13 .094    14 73 .108
14 .062    74 .097
15 .031    79 .052
6 17 .109    84 .025
19 .047    89 .010
20 .031    92 .005
21 .016    15 83 .104
7 22 .109    84 .094
24 .055    89 .053
26 .023    90 .047
28 .008    95 .024
8 28 .098    100 .011
30 .055    101 .009
32 .027    104 .005
34 .012    16 93 .106
35 .008    94 .096
36 .004    100 .052
9 34 .102    106 .025
37 .049    112 .011
39 .027    113 .009
42 .010    116 .005
44 .004    17 104 .103
10 41 .097    105 .095
44 .053    112 .049
47 .024    118 .025
50 .010    125 .010
52 .005    129 .005
11 48 .103    18 116 .098
52 .051    124 .049
55 .027    131 .024
59 .009    138 .010
61 .005    143 .005
12 56 .102    19 128 .098
60 .055    136 .052
61 .046    137 .048
64 .026    144 .025
68 .010    152 .010
71 .005    157 .005
13 64 .108    20 140 .101
65 .095    150 .049
69 .055    158 .024
70 .047    167 .010
74 .024    172 .005
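
The entries of Table A.13 are exact null probabilities, and they can be reproduced by counting. The sketch below is an illustration rather than the method used to build the table: it assumes that, under the null hypothesis, each of the ranks 1, ..., n enters the signed-rank sum independently with probability 1/2, so every subset of ranks is equally likely. The function name `signed_rank_upper_tail` is hypothetical.

```python
def signed_rank_upper_tail(n, c):
    """Exact P(S+ >= c) under H0 for sample size n (no ties assumed).

    counts[s] holds the number of subsets of {1, ..., n} whose ranks
    sum to s; each of the 2**n subsets is equally likely under H0.
    """
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1  # the empty subset
    for rank in range(1, n + 1):
        # Go downward so each rank is used at most once per subset.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    favorable = sum(counts[max(c, 0):])
    return favorable / 2 ** n
```

For n = 5 the three largest sums (13, 14, 15) each arise from exactly one subset, which gives 3/32, 2/32, and 1/32 for c = 13, 14, and 15, in agreement with the tabled .094, .062, and .031.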


Table A.14 Critical Values for the Wilcoxon Rank-Sum Test
$P_0(W \ge c) = P(W \ge c \text{ when } H_0 \text{ is true})$
m n c $P_0(W \ge c)$    m n c $P_0(W \ge c)$
3 3 15 .05    40 .004
4 17 .057    6 40 .041
18 .029    41 .026
5 20 .036    43 .009
21 .018    44 .004
6 22 .048    7 43 .053
23 .024    45 .024
24 .012    47 .009
7 24 .058    48 .005
26 .017    8 47 .047
27 .008    49 .023
8 27 .042    51 .009
28 .024    52 .005
29 .012    6 6 50 .047
30 .006    52 .021
4 4 24 .057    54 .008
25 .029    55 .004
26 .014    7 54 .051
5 27 .056    56 .026
28 .032    58 .011
29 .016    60 .004
30 .008    8 58 .054
6 30 .057    61 .021
32 .019    63 .010
33 .010    65 .004
34 .005    7 7 66 .049
7 33 .055    68 .027
35 .021    71 .009
36 .012    72 .006
37 .006    8 71 .047
8 36 .055    73 .027
38 .024    76 .010
40 .008    78 .005
41 .004    8 8 84 .052
5 5 36 .048    87 .025
37 .028    90 .010
39 .008    92 .005
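
The probabilities in Table A.14 can likewise be verified by brute-force enumeration for the small sample sizes covered. This is a hedged sketch, assuming W is the sum of the ranks of the m observations from the first sample in the combined sample of m + n values, and that under the null hypothesis every choice of m ranks is equally likely. The function name `rank_sum_upper_tail` is hypothetical.

```python
from itertools import combinations
from math import comb


def rank_sum_upper_tail(m, n, c):
    """Exact P(W >= c) under H0 by enumerating all rank assignments.

    Practical only for small m and n (the table stops at m = n = 8,
    which means at most comb(16, 8) = 12870 subsets).
    """
    ranks = range(1, m + n + 1)
    favorable = sum(1 for subset in combinations(ranks, m) if sum(subset) >= c)
    return favorable / comb(m + n, m)
```

With m = 3 and n = 3, only the rank set {4, 5, 6} reaches a sum of 15, so the probability is 1/20 = .05, matching the first row of the table.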


Table A.15 Critical Values for the Wilcoxon Signed-Rank Interval $(\bar{x}_{(n(n+1)/2-c+1)}, \bar{x}_{(c)})$
Confidence Confidence Confidence 
n Level (%) c n Level (%) c n Level (%) c 
5 93.8 15 13 99.0 81 20 99.1 173 
87.5 14 95.2 74 95.2 158 
6 96.9 21 90.6 70 90.3 150 
93.7 20 14 99.1 93 21 99.0 188 
90.6 19 95.1 84 95.0 172 
7 98.4 28 89.6 79 89.7 163
95.3 26 15 99.0 104 22 99.0 204 
89.1 24 95.2 95 95.0 187 
8 99.2 36 90.5 90 90.2 178 
94.5 32 16 99.1 117 23 99.0 221 
89.1 30 94.9 106 95.2 203 
9 99.2 44 89.5 100 90.2 193 
94.5 39 17 99.1 130 24 99.0 239 
90.2 37 94.9 118 95.1 219 
10 99.0 52 90.2 112 89.9 208 
95.1 47 18 99.0 143 25 99.0 257 
89.5 44 95.2 131 95.2 236 
11 99.0 61 90.1 124 89.9 224 
94.6 55 19 99.1 158 
89.8 52 95.1 144 
12 99.1 71 90.4 137 
94.8 64 
90.8 61 
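
To show how a value of c from Table A.15 is used, here is a minimal sketch under stated assumptions: the $\bar{x}_{(k)}$ in the table title are taken to be the ordered pairwise averages $(x_i + x_j)/2$ with $i \le j$ (n(n+1)/2 of them in total), and the interval runs from the (n(n+1)/2 − c + 1)th smallest average to the cth smallest. The function name and the sample data in the usage note are hypothetical.

```python
from itertools import combinations_with_replacement


def signed_rank_interval(data, c):
    """Interval from ordered pairwise averages, given a tabled c."""
    n = len(data)
    total = n * (n + 1) // 2  # number of pairwise averages, i <= j
    if not 1 <= c <= total:
        raise ValueError("c must be between 1 and n(n+1)/2")
    averages = sorted(
        (a + b) / 2 for a, b in combinations_with_replacement(data, 2)
    )
    lower = averages[total - c]  # (total - c + 1)th smallest, 1-indexed
    upper = averages[c - 1]      # c-th smallest, 1-indexed
    return lower, upper
```

For the made-up sample 1, 2, 3, 4, 5 and the n = 5 entry c = 14 (confidence level 87.5%), the interval uses the 2nd and 14th of the 15 ordered averages.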


Table A.16 Critical Values for the Wilcoxon Rank-Sum Interval $(d_{ij(mn-c+1)}, d_{ij(c)})$
Smaller Sample Size 
5 6 7 8 
Larger Confidence Confidence Confidence Confidence 
Sample Size Level (%) c Level (%) c Level (%) c Level (%) c 
5 99.2 25 
94.4 22 
90.5 21 
6 99.1 29 99.1 34 
94.8 26 95.9 31 
91.8 25 90.7 29 
7 99.0 33 99.2 39 98.9 44 
95.2 30 94.9 35 94.7 40 
89.4 28 89.9 33 90.3 38 
8 98.9 37 99.2 44 99.1 50 99.0 56 
95.5 34 95.7 40 94.6 45 95.0 51 
90.7 32 89.2 37 90.6 43 89.5 48 
9 98.8 41 99.2 49 99.2 56 98.9 62
95.8 38 95.0 44 94.5 50 95.4 57
88.8 35 91.2 42 90.9 48 90.7 54 
10 99.2 46 98.9 53 99.0 61 99.1 69 
94.5 41 94.4 48 94.5 55 94.5 62
90.1 39 90.7 46 89.1 52 89.9 59 
11 99.1 50 99.0 58 98.9 66 99.1 75 
94.8 45 95.2 53 95.6 61 94.9 68 
91.0 43 90.2 50 89.6 57 90.9 65 
12 99.1 54 99.0 63 99.0 72 99.0 81 
95.2 49 94.7 57 95.5 66 95.3 74 
89.6 46 89.8 54 90.0 62 90.2 70 
Smaller Sample Size 
9 10 11 12 
Larger Confidence Confidence Confidence Confidence 
Sample Size Level (%) c Level (%) c Level (%) c Level (%) c 
9 98.9 69 
95.0 63 
90.6 60 
10 99.0 76 99.1 84 
94.7 69 94.8 76 
90.5 66 89.5 72 
11 99.0 83 99.0 91 98.9 99 
95.4 76 94.9 83 95.3 91 
90.5 72 90.1 79 89.9 86 
12 99.1 90 99.1 99 99.1 108 99.0 116 
95.1 82 95.0 90 94.9 98 94.8 106 
90.5 78 90.7 86 89.6 93 89.9 101 
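
A value of c from Table A.16 can be applied as in the following sketch. It assumes, as an interpretation of the table title, that $d_{ij} = x_i - y_j$ are the mn pairwise differences between the two samples and that the interval runs from the (mn − c + 1)th smallest difference to the cth smallest. The function name and the data in the usage note are hypothetical.

```python
def rank_sum_interval(x, y, c):
    """Interval from ordered pairwise differences x_i - y_j, given c."""
    m, n = len(x), len(y)
    if not 1 <= c <= m * n:
        raise ValueError("c must be between 1 and m*n")
    diffs = sorted(xi - yj for xi in x for yj in y)
    lower = diffs[m * n - c]  # (mn - c + 1)th smallest, 1-indexed
    upper = diffs[c - 1]      # c-th smallest, 1-indexed
    return lower, upper
```

For two made-up samples of size 5, the table entry c = 22 (confidence level 94.4%) selects the 4th and 22nd of the 25 ordered differences.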
Answers to Selected Odd-Numbered Exercises


Chapter 1 


1. a. Los Angeles Times, Oberlin Tribune, Gainesville Sun,
Washington Post

b. Duke Energy, Clorox, Seagate, Neiman Marcus 

c. Vince Correa, Catherine Miller, Michael Cutler, Ken Lee 
d. 2.97, 3.56, 2.20, 2.97 


3. a. How likely is it that more than half of the sampled computers
will need or have needed warranty service? What
is the expected number among the 100 that need warranty 
service? How likely is it that the number needing warranty 
service will exceed the expected number by more than 10? 
b. Suppose that 15 of the 100 sampled needed warranty 
service. How confident can we be that the proportion of 
all such computers needing warranty service is between 
.08 and .22? Does the sample provide compelling evidence 
for concluding that more than 10% of all such computers 
need warranty service? 


5. a. No. All students taking a large statistics course who
participate in an SI program of this sort. 

b. Randomization protects against various biases and 
helps ensure that those in the SI group are as similar as 
possible to the students in the control group. 

c. There would be no firm basis for assessing the effec- 
tiveness of SI (nothing to which the SI scores could rea- 
sonably be compared). 


7. One could generate a simple random sample of all single-family
homes in the city, or a stratified random sample by
taking a simple random sample from each of the 10 district 
neighborhoods. From each of the selected homes, values 
of all desired variables would be determined. This would 
be an enumerative study because there exists a finite, iden- 
tifiable population of objects from which to sample. 


9. a. Possibly measurement error, recording error, differences
in environmental conditions at the time of measurement, etc. 


b. No. There is no sampling frame.


11. 3L | 1
3H | 56678
4L | 000112222234
4H | 5667888
5L | 144 stem: tenths digit
5H | 58
6L | 2
6H | 6678
7L |
7H | 5


The stem-and-leaf display shows that .45 is a good representative
value for the data. In addition, the display is
not symmetric and appears to be positively skewed. The
range of the data is .75 − .31 = .44, which is comparable
to the typical value of .45. This constitutes a reasonably
large amount of variation in the data. The data value .75
is a possible outlier.


13. a. 12 | 2
12 | 445
12 | 6667777
12 | 889999
13 | 00011111111
13 | 222222222233333333333333
13 | 44444444444444444455555555555555555555
13 | 6666666666667777777777
13 | 888888888888999999
14 | 0000001111
14 | 2333333
14 | 444
14 | 77
leaf: ones digit
symmetry




b. Close to bell-shaped, center ≈ 135, not insignificant
dispersion, no gaps or outliers.


15.           Am |    | Fr
                 |  8 | 1
    157020153504 |  9 | 00645632
            9324 | 10 | 2563
            6306 | 11 | 6913
             058 | 12 | 325528
               8 | 13 | 7
                 | 14 |
                 | 15 | 8
               2 | 16 |
Stem: Hundreds and tens
Leaf: Ones

Representative values: low 100s for Am and low 110s
for Fr. Somewhat more variability in Fr times than in
Am times. More extreme positive skew for Am than for Fr.
162 is an Am outlier, and 158 is perhaps a Fr outlier.


17. a. .639, .510 b. .491, .315

c. The relative frequencies are .065, .185, .241, .148, .102,
.083, .056, .074, .028, and .019. The histogram is close to
being unimodal (the peak at 9 would likely disappear with
a significantly larger sample size) and positively skewed. A
typical value of x here is 5, but there is substantial variability
about that typical value. The data set contains no outliers.


19. a. .99 (99%), .71 (71%) b. .64 (64%), .44 (44%)

c. Strictly speaking, the histogram is not unimodal, but
is close to being so with a moderate positive skew. A much
larger sample size would likely give a smoother picture.


21. a. y Freq. Rel. freq.    b. z Freq. Rel. freq.
0 17 .362    0 13 .277
1 22 .468    1 11 .234
2 6 .128    2 3 .064
3 1 .021    3 7 .149
4 0 .000    4 5 .106
5 1 .021    5 3 .064
47 1.000    6 3 .064
.362, .638    7 0 .000
    8 2 .043
    47 1.001
    .894, .830


23. The class widths are not equal, so the density scale must
be used. The densities for the six classes are .2030, .1373,
.0303, .0086, .0021, and .0009, respectively. The resulting
histogram is unimodal with a very substantial positive skew.


25. Class Freq.    Class Freq.
10-<20 8    1.1-<1.2 2
20-<30 14    1.2-<1.3 6
30-<40 8    1.3-<1.4 7
40-<50 4    1.4-<1.5 9
50-<60 3    1.5-<1.6 6
60-<70 2    1.6-<1.7 4
70-<80 1    1.7-<1.8 5
40    1.8-<1.9 1
    40

Original: positively skewed;
Transformed: much more symmetric, not far from bell-shaped.


27. a. The observation 50 falls on a class boundary.


b. Class Freq. Rel. freq.
0-<50 9 .18
50-<100 19 .38
100-<150 11 .22
150-<200 4 .08
200-<300 4 .08
300-<400 2 .04
400-<500 0 .00
500-<600 1 .02
50 1.00


A representative (central) value is either a bit below or a bit 
above 100, depending on how one measures center. There 
is a great deal of variability in lifetimes, especially in values 
at the upper end of the data. There are several candidates 
for outliers. 


c. Class Freq. Rel. freq.
2.25-<2.75 2 .04
2.75-<3.25 2 .04
3.25-<3.75 3 .06
3.75-<4.25 8 .16
4.25-<4.75 18 .36
4.75-<5.25 10 .20
5.25-<5.75 4 .08
5.75-<6.25 3 .06
50 1.00


There is much more symmetry in the distribution of the
ln(x) values than in the x values themselves, and less variability.
There are no longer gaps or obvious outliers.

d. .38, .14 


29. A: .28 B: .19 C: .18
D: .17 E: .09 F: .09
31. Class Freq. Cum. freq. Cum. rel. freq.
0-<4 2 2 .050
4-<8 14 16 .400
8-<12 11 27 .675
12-<16 8 35 .875
16-<20 4 39 .975
20-<24 0 39 .975
24-<28 1 40 1.000
33. a. 640.5, 582.5
b. 610.5, 582.5
c. 591.2
d. 593.71


35. a. 1.237, .56; positive skew
b. 1.118; in between the two
c. .36


37. $\bar{x}_{tr(10)} = 11.46$


39. 
41. 
43. 
45. 


47. 
49. 
51. 
53. 


55. 


57. 


59. 


61. 


63. 


x = 1.0297, x = 1.009 
a. .7 b. Also .7 c. 13
= 68.0, X,.(29) = 66.2, X 4730) = 67.5 


. x = 115.58; the deviations are .82, .32, —.98, —.38, .22 
. .482, .694 c. .482 d. .482 


a 
b. 
a. x = 14.88, x = 14.70 b. .837 
a 
a 
a 


b. .383 


2 


c. .837 
- 56.80, 197.8040 b. .5016, .708 
. 1264.766, 35.564 b. .351, .593


. Bal: 1.121, 1.050, .536 

Gr: 1.244, 1.100, .448 

b. Typical ratios are quite similar for the two types. There is 
somewhat more variability in the Bal sample, due primarily 
to the two outliers (one mild, one extreme). For Bal, there is 
substantial symmetry in the middle 50% but positive skew- 
ness overall. For Gr, there is substantial positive skew in the 
middle 50% and mild positive skewness overall. 


a. 33 b. No 

c. Slight positive skewness in the middle half, but rather sym- 
metric overall. The extent of variability appears substantial. 
d. At most 32 


a. Yes. 125.8 is an extreme outlier and 250.2 is a mild 
outlier. 

b. In addition to the presence of outliers, there is positive 
skewness both in the middle 50% of the data and, excepting 
the outliers, overall. Except for the two outliers, there appears 
to be a relatively small amount of variability in the data. 


a. ED: .4, .10, 2.75, 2.65; 

Non-Ed: 1.60, .30, 7.90, 7.60 

b. ED: 8.9 and 9.2 are mild outliers, and 11.7 and 21.0 are 
extreme outliers. 

There are not outliers in the non-ED sample. 

c. Four outliers for ED, none for non-ED. Substantial 
positive skewness in both samples; less variability in ED 
(smaller $f_s$), and non-ED observations tend to be somewhat
larger than ED observations. 


Outliers, both mild and extreme, only at 6 a.m. Distributions 
at other times are quite symmetric. Variability increases 
somewhat until 2 P.M. and then decreases slightly, and the
same is true of “typical” gasoline-vapor coefficient values. 
$\bar{x}$ = 64.89, $\tilde{x}$ = 64.70, s = 7.803, lower 4th = 57.8, upper
4th = 70.4, $f_s$ = 12.6. A histogram consisting of 8 classes
starting at 52, each of width 4, is bimodal but close to unimodal
with a positive skew. A boxplot shows no outliers,


67. 


69. 
71. 


73. 


75. 


77. 


79. 


81. 
83. 


there is a very mild negative skew in the middle 50%, and 


the upper whisker is much longer than the lower whisker. 
b. .9231, .9053 


c. .48 
a. M: $\bar{x}$ = 3.64, $\tilde{x}$ = 3.70, s = .269, $f_s$ = .40
F: $\bar{x}$ = 3.28, $\tilde{x}$ = 3.15, s = .478, $f_s$ = .50


Female values are typically somewhat smaller than male 
values, and show somewhat more variability. An M box- 
plot shows negative skew whereas an F boxplot shows 
positive skew. 

b. F: $\bar{x}_{tr(10)}$ = 3.24 M: $\bar{x}_{tr(10)}$ = 3.652 ≈ 3.65


b. 189.14, 1.87 


a. The mean, median, and trimmed mean are virtually 
identical, suggesting a substantial amount of symmetry in 
the data; the fact that the quartiles are roughly the same 
distance from the median and that the smallest and larg- 
est observations are roughly equidistant from the center 
provides additional support for symmetry. The standard 
deviation is quite small relative to the mean and median. 
b. See the comments of (a). In addition, using 1.5(Q3 — Q1) 
as a yardstick, the two largest and three smallest observa- 
tions are mild outliers. 


$\bar{x}$ = .9255, s = .0809, $\tilde{x}$ = .93, small amount of variability,
slight bit of skewness.


a. $\bar{y} = a\bar{x} + b$, $s_y^2 = a^2 s_x^2$


a. The “five-number summaries” ($\tilde{x}$, the two fourths, and
the smallest and largest observations) are identical and there 
are no outliers, so the three individual boxplots are identical. 
b. Differences in variability, nature of gaps, and existence 
of clusters for the three samples. 

c. No. Detail is lost. 


c. Representative depths are quite similar for the four types 
of soils—between 1.5 and 2. Data from the C and CL soils 
shows much more variability than for the other two types. 
The boxplots for the first three types show substantial positive 
skewness both in the middle 50% and overall. The boxplot for 
the SYCL soil shows negative skewness in the middle 50% and 
mild positive skewness overall. Finally, there are multiple outli- 
ers for the first three types of soils, including extreme outliers. 


a. $\bar{x}_{n+1} = (n\bar{x}_n + x_{n+1})/(n + 1)$
c. 12.53, .532
A substantial positive skew (assuming unimodality) 


a. All points fall on a 45° line. Points fall below a 45° line. 
b. Points fall well below a 45° line, indicating a substan- 
tial positive skew. 


Chapter 2 


1. 


a. $\mathcal{S}$ = {1324, 3124, 1342, 3142, 1423, 1432, 4123, 4132,
2314, 2341, 3214, 3241, 2413, 2431, 4213, 4231} 
b. A = {1324, 1342, 1423, 1432} 


c. B = {2314, 2341, 3214, 3241, 2413, 2431, 4213, 4231}
d. AU B = {1324, 1342, 1423, 1432, 2314, 2341, 3214, 
3241, 2413, 2431, 4213, 4231}, 
A ∩ B contains no outcomes (A and B are disjoint),
A′ = {3124, 3142, 4123, 4132, 2314, 2341, 3214, 3241,
2413, 2431, 4213, 4231}
3. a. A = {SSF, SFS, FSS}
b. B = {SSF, SFS, FSS, SSS}
c. C = {SFS, SSF, SSS}
d. C′ = {FFF, FSF, FFS, FSS, SFF},
A ∪ C = {SSF, SFS, FSS, SSS},
A ∩ C = {SSF, SFS},
B ∪ C = {SSF, SFS, FSS, SSS} = B,
B ∩ C = {SSF, SFS, SSS} = C
5. a. $\mathcal{S}$ = {(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1), (1, 2, 2), (1, 2, 3),
(1, 3, 1), (1, 3, 2), (1, 3, 3), (2, 1, 1), (2, 1, 2), (2, 1, 3), (2, 2, 1),
(2, 2, 2), (2, 2, 3), (2, 3, 1), (2, 3, 2), (2, 3, 3), (3, 1, 1), (3, 1, 2),
(3, 1, 3), (3, 2, 1), (3, 2, 2), (3, 2, 3), (3, 3, 1), (3, 3, 2), (3, 3, 3)}
b. {(1, 1, 1), (2, 2, 2), (3, 3, 3)}
c. {(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)}
d. {(1, 1, 1), (1, 1, 3), (1, 3, 1), (1, 3, 3), (3, 1, 1), (3, 1, 3), (3, 3, 1), (3, 3, 3)}
7. a. There are 35 outcomes in $\mathcal{S}$ b. {AABABAB,
AABAABB, AAABBAB, AAABABB, AAAABBB}
11. a. .07 b. .30 c. .57
13. a. .36 b. .64 c. .53
d. .47 e. .17 f. .75
15. a. .572 b. .879
17. a. There are statistical software packages other than SPSS
and SAS.
‘ae ee
19. a. .8841 b. .0435
21. a. .10 b. .18, .19 c. .41 d. .59
e. .31 f. .69
23. a. .067 b. .400 c. .933 d. .533
25. a. .85 b. .15 e2y “ao35
27. a. 1 b. 7 c. 6
29. a. 676; 1296 b. 17,576; 46,656
c. 456,976; 1,679,616 d. .942
31. a. 45 b. 1440 days (almost 4 years)
33. a. 1,816,214,440 b. 659,067,881,572,000
c. 9,072,000
35. a. 38,760, .0048 b. .0054 c. .9946
d. .2885
37. a. 60 b. 10 c. .0456
39. a. .145 b. .075 c. .264 d. .154
41. a. 10,000 b. .9876 c. .03 d. .0337
43. . 4, .00394, . O01
45. a. .447, .500, .200 b. .400, .447 c. .211
47. a. .73 b. .22 c. .50; among all those who have a
Visa card, 50% have MasterCards; .75
d. .40 e. .85
49. a. .34, .40 b. .588 c. .50
51. a. .0312 b. .024
53. .083
55. .236
59. a. 21 . A . .264, 462, .274
. Bese Se adbttee:
61. a. .578, .278, .144 b. 0, .457, .543
63. b. .54 c. .68 d. .74 e. .7941
65. .087, .652, .261
G. 105812 Go gggaa0.
a er ee
69. a. .126 b. .05 c. .1125
d. .2725 e. .5325 f. .2113
71. a. .300 b. .820 c. .146
75. AOL, .722
77. a. .00648 b. .00421
79. .0059
81. a. .95
83. a. .10, .20 b. 0
85. a. $p(2 - p)$ b. $1 - (1 - p)^n$ c. $(1 - p)^3$
d. $.9 + (1 - p)^3(.1)$
e. $.1(1 - p)^3/[.9 + .1(1 - p)^3] = .0137$ for $p = .5$
87. a. .40 b. .571
c. No: .571 ≠ .65, and also .40 ≠ (.65)(.7) d. .733
89. $[2\pi(1 - \pi)]/(1 - \pi^2)$
91. a. .333, .444 b. .150, .291
93. .45, .32
95. a. .0083 b. .2 c. .2 d. .1074
97. .905
99. a. .974 b. .9754
101. .926
103. a. .008 b. .018 c. .601
105. a. .883, .117 b. 23 c. .156
107. $1 - (1 - p_1)(1 - p_2) \cdots (1 - p_n)$
109. a. .0417 b. .375
111. P(hire #1) = 6/24 for s = 0, = 11/24 for s = 1, = 10/24
for s = 2, and = 6/24 for s = 3, so s = 1 is best.
113. 1/4 = $P(A_1 \cap A_2 \cap A_3)$


Chapter 3 


19. 
21. 


23. 
25. 
27. 


29. 
31. 
33. 
35. 
37. 


39. 


eae (0), Oy: 6.45.0 


.a. 81 


. x = 0 for FFF; x = 1 for SFF, FSF, and FFS; x = 2 for 


SSF, SFS, and FSS; and x = 3 for SSS 


. Z = average of the two numbers, with possible values 2/2, 


3/2,..., 12/2; W = absolute value of the difference, with 
possible values 0, 1, 2, 3, 4, 5 


. No. In Example 3.4, let Y = 1 if at most three batteries are 


examined and let Y = 0 otherwise. Then Y has only two 
values. 


, 12} ; discrete ex {1 23. 3536 4..}3 
discrete e. {Z1, Za-++ » Zy}, discrete because there 
are only a finite number N of different sales tax percent- 
ages across the entire country g. {ximsx=M} 
where m (M) is the minimum (maximum) possible 
tension; continuous 


. a. {2,4,6,8,...}, that is, {2(1), 2(2), 2(3), 2(4),...}, an 


infinite sequence; discrete 
b. {2, 3, 4,5, 6,...}, thatis, {1 + 1,14 
4,...}, an infinite sequence; discrete 


. b. 55, .25 c. .70 
. a. .70 b. .45 c. .55 
d. .71 e. .65 f. .45 
a. (1, 2), (1, 3), CL, 4), C1, 5), (2; 3), (2, 4), (2, 5), 3, 4), 


(3, 5), (4, 5) b. p(0) = .3, p(1) = .6, p(2) = .1

c. F(x) = 0 for x < 0, = .3 for 0 ≤ x < 1, = .9 for
1 ≤ x < 2, and = 1 for 2 ≤ x

b. .162 ce. It is A; AUUUA, UAUUA, 
UUAUA, UUUAA; .00324 

p(0) = .09, p(1) = .40, p(2) = .32, p(3) = .19 

b. p(x) = .301, .176, .125, .097, .079, .067, .058, .051, 
.046 for x = 1, 2,...,9 

c. F(x) = 0 for x < 1, = .301 for 1 ≤ x < 2, = .477 for
2 ≤ x < 3, ..., = .954 for 8 ≤ x < 9, = 1 for x ≥ 9

d. .602, .301 

a. .20 b. .33 c. .78 d. 
a. $p(y) = (1 - p)^y \cdot p$ for y = 0, 1, 2, 3, ...
a. 1234, 1243, 1324,..., 4321 

b. p(0) = 9/24, p(1) = 8/24, p(2) = 6/24, p(3) = 0, 
p(4) = 1/24 


53 


a. 6.45 b. 15.6475 c. 3.96 d. 15.6475 
4.49, 2.12, .68 

a. p b. p(1 − p) c. p

E[h,(X)] = $4.93, E[h,(X)] = $5.33, so 4 copies is better. 


$E(X) = (n + 1)/2$, $E(X^2) = (n + 1)(2n + 1)/6$, $V(X) = (n^2 - 1)/12$


2.3, .81, 88.5, 20.25


43. E(X − c) = E(X) − c, E(X − μ) = 0
47. a. 001 b. .001 ce. .147 d. .001 
e. 1.000 f. .001 
49. a. 354 b. 115 c. .918 
51. a. 6.25 b. 2.17 c. .030 
53. a. .403 b. .787 c. .774 
55. .1478 
57. .407, independence 
59. a. .017 b. .811, .425 c. .006, .902, .586 
61. When p = .9, the probability is .99 for A and .9963 


63. 
65. 
67. 


69. 


for B. If p = .5, these probabilities are .75 and .6875, 
respectively. 


The tabulation for p > .5 is unnecessary. 
a. 20, 16 b. 70, 21 


P(|X − μ| ≥ 2σ) = .042 when p = .5 and = .065 when
p = .75, compared to the upper bound of .25. Using 
k = 3 in place of k = 2, these probabilities are .002 and 
.004, respectively, whereas the upper bound is .11. 


.379, .879 b. .121 c. Use the binomial distri- 
bution with n = 15, p = .10 


71. a. h(x; 15, 10, 20) for x = 5,...,10 
b. .0325 c. .697 
73. a. h(x; 10, 10, 20) b. .033 ce. h(x; n,n, 2n) 
75. a. nb(x; 2, .2) b. .0768 ce. .1808 
d. 8, 10 
77. nb(x; 6, .5), 6 
79. a. .999 b. .184 c. .260 d. .080 
81. a. .011 b. .441 ce. .554, .459 
d. .945 
83. Poisson(5) a. .492 b. .133 
85. a. .122, .809, .283 b. 12, 3.464 
ce. .530, .011 
87. a. .099 b. .135 c. 2 
89. a. 4 b. .215 c. At least −ln(.1)/2 ≈ 1.1513 years
91. a. .221 b. 6,800,000 c. p(x; 20.106) 
95. b. 3.114, .405, .636 
97. a. b(x; 15, .75) b. .686 
ce. .313 d. 11.25, 2.81 e. .310 
99, .991 
101. a. p(x; 2.5) b. .067 c. .109 
103. 1.813, 3.05 
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A-34 Answers to Selected Odd-Numbered Exercises 


105. p(2) =p’, p3) = (1 — p)p’, p4) = (1 — pip’, po) = ~—— 3. a. No b. .0273 
bes 2 = . 
[1 — p2) PO 3)I0 — p)p* for x=5,6, 7.3 115. be Sy + Spty Ce .25(y — ply)? + 5M, + By) 
99950841 d. .6 and .4 replace .5 and .5, respectively. 
107. a. .0029 b. .0767, .9702 17. pl isjst + Pip Ps where p,=0 if k<0 or 
al a 5 
109. a. .135 b. 00144 eS _ [ps 2)] k> 10. 
111. 3.590 121. a. 2.50 b. 3.1 
Chapter 4 
1. b. .4625, same c. .5, .278125
3. b. .5 c. .6875 d. .6328
5. a. .375 b. .125 c. .297 d. .578
7. b. .309 c. .494 (μ = 2.225 by symmetry of the density curve) d. .247
9. a. .451, .549 b. .312
11. a. .25 b. .1875 c. .4375 d. 1.4142 e. f(x) = x/2 for 0 < x < 2 f. 1.33 g. .222, .471 h. 2
13. a. 3 b. F(x) = 0 for x ≤ 1, = 1 − 1/x³ for x > 1 c. .125, .088 d. 1.5, .866 e. .924
15. a. F(x) = 0 for x ≤ 0, = 90[x⁹/9 − x¹⁰/10] for 0 < x < 1, = 1 for x ≥ 1 b. .0107 c. .0107, .0107 d. .9036 e. .818, .111 f. .3137
17. a. A + (B − A)p b. E(X) = (A + B)/2, σ_X = (B − A)/√12 c. [B^(n+1) − A^(n+1)]/[(n + 1)(B − A)]
19. a. .597 b. .369 c. f(x) = .3466 − .25 ln(x) for 0 < x < 4
21. 314.79
23. 248, 3.60
25. b. 1.8(90th percentile for X) + 32 c. a(X percentile) + b
27. 0, 1.814
29. a. 2.14 b. .81 c. 1.17 d. .97 e. 2.41
31. a. 2.54 b. 1.34 c. −.42
33. a. .9664 b. .2451 c. .8664
35. a. .0455, .0455 b. Approximately 0 c. 6460 d. 2.13 e. .1700
37. a. 0, .5793, .5793 b. .3174, no c. < 87.6 or > 120.4
39. a. .1003, .1003 b. 35.226 c. 21.888 d. 20.016, 39.984
41. .002
43. a. 58.31, 11.665 b. .4768 c. .1587
45. 7.3%
47. 21.155
49. a. .1190, .6969 b. .0021 c. .7019 d. > 5020 or < 1844 (using z = 3.295) e. Normal, μ = 7.559, σ = 1.061, .7019
51. .3174 for k = 1, .0456 for k = 2, .0026 for k = 3, as compared to the bounds of 1, .25, and .111, respectively.
53. a. Exact: .212, .577, .573; Approximate: .211, .567, .596 b. Exact: .885, .575, .017; Approximate: .885, .579, .012 c. Exact: .002, .029, .617; Approximate: .003, .033, .599
55. a. .9409 b. .9943
57. b. Normal, μ = 239, σ² = 12.96
59. a. 1 b. 1 c. .982 d. .129
61. a. .480, .667, .187 b. .050, 0
63. a. short ⇒ plan #1 better, whereas long ⇒ plan #2 better b. 1/λ = 10 ⇒ E[h1(X)] = 100, E[h2(X)] = 112.53; 1/λ = 15 ⇒ E[h1(X)] = 150, E[h2(X)] = 138.51
65. a. 3.01, 12.44 b. .238 (.237 using software) c. .176
67. a. .424 b. .567, μ̃ < 24 c. 60 d. 66
69. a. ∩A_i b. Exponential with λ = .05 c. Exponential with parameter nλ
73. a. .826, .826, .0636 b. .664 c. 172.727
77. a. 123.97, 117.373 b. .5517 c. .1587
79. a. 9.164, .385 b. .8790 c. .4247, skewness d. No, since P(X < 17,000) = .9332
81. a. 3.962, 1.921 b. .0375 c. .2795 d. 7.77 e. 13.74 f. 4.52
83. α = β
85. b. [Γ(α + β) · Γ(m + β)]/[Γ(α + β + m) · Γ(β)]; β/(α + β)
87. Yes, since the pattern in the plot is quite linear.
89. Yes, because a normal probability plot shows a substantial linear pattern.
91. Yes
93. Plot ln(x) vs. z percentile. The pattern is straight, so a lognormal population distribution is plausible.
95. The pattern in the plot is quite linear; it is very plausible that strength is normally distributed.
97. There is substantial curvature in the plot. λ is a scale parameter (as is σ for the normal family).
99. a. F(y) = (1/48)(y² − y³/18) for 0 ≤ y ≤ 12 b. .259, .5, .241 c. 6, 43.2, 7.2 d. .518 e. 3.75
101. a. f(x) = x² for 0 ≤ x < 1 and = 7/4 − (3/4)x for 1 ≤ x ≤ 7/3 b. .917 c. 1.213
103. a. .9162 b. .9549 c. 1.3374
105. .506
107. b. F(x) = 0 for x < −1, = (4x − x³/3)/9 + 11/27 for −1 ≤ x ≤ 2, and = 1 for x > 2 c. No. F(0) < .5 ⇒ μ̃ > 0 d. Y ~ Bin(10, 5/27)
109. a. .368, .828, .460 b. 352.53 c. (1/β) · exp[−exp(−(x − α)/β)] · exp(−(x − α)/β) d. α e. μ = 201.95, mode = 150, μ̃ = 182.99
111. a. μ b. No c. 0 d. (α − 1)β e. ν − 2
113. a. μ = p/λ1 + (1 − p)/λ2, V(X) = 2p/λ1² + 2(1 − p)/λ2² − μ² b. p(1 − exp(−λ1x)) + (1 − p)(1 − exp(−λ2x)) for x ≥ 0 c. .403 d. .879 e. 1, CV > 1 f. CV < 1
115. a. Lognormal b. 1 c. 2.72, .0185
119. a. Exponential with λ = 1 c. Gamma with parameters α and cβ
121. a. (1/365)³ b. (1/365)² c. .000002145
123. b. Let u1, u2, u3, … be a sequence of observations from a
Unif[0, 1] distribution (a sequence of random numbers).
Then with x_i = (−.1)ln(1 − u_i), the x_i’s are observations
from an exponential distribution with λ = 10.
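
The transformation in the answer to Exercise 123(b) can be tried numerically. The following is only a minimal sketch, not part of the printed answer: the function names, the seed, and the sample size are assumptions made for illustration. It inverts the exponential cdf F(x) = 1 − e^(−λx) at each uniform observation, which for λ = 10 gives x = (−.1)ln(1 − u).

```python
import math
import random


def exponential_from_uniform(u, lam):
    """Map one Unif[0, 1) observation u to an exponential observation with rate lam."""
    # Inverse of the exponential cdf F(x) = 1 - exp(-lam * x).
    return -(1.0 / lam) * math.log(1.0 - u)


def simulate(n, lam=10.0, seed=0):
    """Generate n exponential observations from n uniform random numbers."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    return [exponential_from_uniform(rng.random(), lam) for _ in range(n)]


if __name__ == "__main__":
    xs = simulate(100_000)
    # The sample mean should be close to 1/lam = 0.1.
    print(sum(xs) / len(xs))
```

Because random() returns values in [0, 1), the argument of the logarithm is always positive, so the sketch never evaluates ln(0).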
125. g(E(X)) ≤ E(g(X))
127. a. 710, 84.423, .684 b. .376 


Chapter 5 


1. a. .20 b. .42 c. At least one hose is in use at each pump; .70. d. p_X(x) = .16, .34, .50 for x = 0, 1, 2, respectively; p_Y(y) = .24, .38, .38 for y = 0, 1, 2, respectively; .50 e. No; p(0, 0) ≠ p_X(0) · p_Y(0)
3. a. .15 b. .40 c. .22 d. .17, .46
5. a. .054 b. .00018
7. a. .030 b. .120 c. .300 d. .380 e. Yes
9. a. 3/380,000 b. .3024 c. .3593 d. 10Kx² + .05 for 20 ≤ x ≤ 30 e. No
11. a. e^(−μ1−μ2) · μ1^x · μ2^y/(x! y!) b. e^(−μ1−μ2) · [1 + μ1 + μ2] c. e^(−(μ1+μ2)) · (μ1 + μ2)^m/m!; Poisson(μ1 + μ2)
13. a. e^(−x−y) for x ≥ 0, y ≥ 0 b. .400 c. .594 d. .330
15. a. F(y) = 1 − e^(−λy) + (1 − e^(−λy))² − (1 − e^(−λy))³ for y ≥ 0 b. 2/(3λ)
17. a. .25 b. .318 c. .637 d. f_X(x) = 2√(R² − x²)/(πR²) for −R ≤ x ≤ R; no
19. a. K(x² + y²)/(10Kx² + .05); K(x² + y²)/(10Ky² + .05) b. .556, .549 c. 25.37, 2.87
21. a. f(x1, x2, x3)/f_{X1,X2}(x1, x2) b. f(x1, x2, x3)/f_{X1}(x1)
23. .15
25. L²
27. .25 hr
29. −2/3
31. a. −.1082 b. −.0131
37. a. x̄    | 25   32.5  40   45   52.5  65
       p(x̄) | .04  .20   .25  .12  .30   .09
       E(X̄) = μ = 44.5
    b. s²    | 0    112.5  312.5  800
       p(s²) | .38  .20    .30    .12
       E(S²) = 212.25 = σ²
39. This comes straight from the Bin(15, .8) distribution and Appendix Table A.1:
       x/15    | 0     …  .333  .400  .467  …  .867  .933  1
       p(x/15) | .000  …  .000  .001  .003  …  .231  .132  .035
41. a. x̄    | 1    1.5  2    2.5  3    3.5  4
       p(x̄) | .16  .24  .25  .20  .10  .04  .01
    b. .85
    c. r    | 0    1    2    3
       p(r) | .30  .40  .22  .08
47. a. .9876 b. .0009


49. a. .6026 b. .2981
51. .7720
53. a. .0062 b. 0
55. a. .9838 b. .8926
57. .9616
59. a. .9986, .9986 b. .9015, .3970 c. .8357 d. .9525, .0003
61. a. 3.5, 2.27, 1.51 b. 15.4, 75.94, 8.71
63. a. .695 b. 4.0675 > 2.6775
65. a. .9232 b. .9660
67. .1588
69. a. 2400 b. 1205; independence c. 2400, 41.77
71. a. 158, 430.25 b. .9788
73. a. Approximately normal with mean = 105, SD = 1.2649; Approximately normal with mean = 100, SD = 1.0142 b. Approximately normal with mean = 5, SD = 1.6213 c. .0068 d. .0010, yes
75. a. .2, .5, .3 for x = 12, 15, 20; .10, .35, .55 for y = 12, 15, 20 b. .25 c. No d. 33.35 e. 3.85
77. a. 3/81,250 b. f(x) = k(250x − 10x²) for 0 ≤ x ≤ 20 and = k(450x − 30x² + .5x³) for 20 < x ≤ 30; f(y) results from substituting y for x in f(x). They are not independent. c. .355 d. 25.969 e. 204.6154, −.894 f. 7.66
79. ≈ 1
81. a. 400 min b. 70
83. 97
85. .9973
87. a. .2902 b. .8185 c. The X̄ distribution is much more concentrated about 13 than is the population distribution. d. 0
91. b., c. Chi-squared with ν = n.
93. a. σ_W²/(σ_W² + σ_E²) b. .9999
97. a. .6 b. U = ρX + √(1 − ρ²) Y


Chapter 6




1. a. 8.14, X̄ b. 7.7, X̃ c. 1.66, S d. .148 e. .204, S/X̄
3. a. 1.348, X̄ b. 1.348, X̄ c. 1.781, X̄ + 1.28S d. .6736 e. .0846
5. N·x̄ = 1,703,000; T − N·d̄ = 1,591,300; T · (x̄/ȳ) = 1,601,438.281
7. a. 120.6 b. 1,206,000 c. .80 d. 120.0
9. a. 2.11 b. .119
15. a. θ̂ = ΣX_i²/(2n) b. 74.505
17. b. .444
19. a. p̂ = 2λ̂ − .30 = .20 b. p̂ = (100λ̂ − 9)/70
21. b. α̂ = 5, β̂ = 28.0/Γ(1.2)


23. 0 = DX?/n 
25. a. 384.4, 18.86 b. 415.42 c. .7967


29. a. θ̂ = min(X_i), λ̂ = n/Σ[X_i − min(X_i)]
b. .64, .202 


11. b. [p1q1/n1 + p2q2/n2]^(1/2) c. Use p̂_i = x_i/n_i and q̂_i = 1 − p̂_i in place of p_i and q_i in part (b) for i = 1, 2. d. −.245 e. .041
33. With x_i = time between birth i − 1 and birth i, λ̂ = 6/Σ_{i=1}^{6} i·x_i = .0436
35. 29.5
37. 1.0132

Chapter 7 

1. a. 99.5% b. 85% c. 2.96 d. 1.15
3. a. Narrower b. No c. No d. No
5. a. (4.52, 5.18) b. (4.12, 5.00) c. 55 d. 94
7. By a factor of 4; the width is decreased by a factor of 5.
9. a. (x̄ − 1.645σ/√n, ∞); (4.57, ∞) b. (x̄ − z_α · σ/√n, ∞) c. (−∞, x̄ + z_α · σ/√n); (−∞, 59.70)
11. 950, .8714
13. a. (608.58, 699.74) b. 189
15. a. 80% b. 98% c. 75%
17. 134.53
19. (.513, .615)
21. p < .273 with 95% confidence; yes
23. a. (.225, .275) b. 2655
25. a. 381 b. 339
29. a. 2.228 b. 2.086 c. 2.845 d. 2.680 e. 2.485 f. 2.571
31. a. 1.812 b. 1.753 c. 2.602 d. 3.747 e. 2.1716 (from Minitab) f. Roughly 2.43
33. a. Reasonable amount of symmetry, no outliers b. Yes (based on a normal probability plot) c. (430.5, 446.1), yes, no
35. a. 95% CI: (23.1, 26.9) b. 95% PI: (17.2, 32.8), roughly 4 times as wide




37. a. (.888, .964) b. (.752, 1.100) c. (.634, 1.218) 
39. a. Yes b. (6.45, 98.01) c. (18.63, 85.83) 
41. All 70%; (c), because it is shortest 

43. a. 18.307 b. 3.940 c. .95 d. .10 
45. (4.82, 21.85); no 

47. a. 95% CI: (6.702, 9.456) b. (.166, .410) 

49. (47.4, 83.4) 

51. a. (.260, .457) b. 2398 c. No—97.5% 
53. (—.84, —.16) 

55. 246 

57. (2t,/X} an, 2t,/ Xz, 2») = (65.3, 232.5) 


59. a. (max(x_i)/(1 − α/2)^(1/n), max(x_i)/(α/2)^(1/n)) b. (max(x_i), max(x_i)/α^(1/n)) c. (b); (4.2, 7.65)
61. (73.6, 78.8) versus (75.1, 79.6)


Chapter 8 


1. a. Yes b. No; x̃ is the sample median, not a parameter c. No; s is the sample standard deviation, not a parameter d. Yes e. No; X̄ and Ȳ are statistics, not parameters f. Yes
3. a. Reject H0 because .001 ≤ .05 = α b. Reject H0 c. Don’t reject H0 because .078 > .05 d. Reject H0 (a close call) e. Don’t reject H0
5. Because this setup puts the burden of proof on the welds to show that they conform to specifications; only if there is compelling evidence for this will the welds be judged satisfactory.
7. H0: σ = .05 versus Ha: σ < .05. I: conclude variability in thickness is satisfactory when it is not. II: conclude variability in thickness is not satisfactory when in fact it is.
9. I: concluding that the plant isn’t in compliance when it is. II: concluding that the plant is in compliance when it is not.
11. a. I: concluding that a majority favor one of the two companies when that is not the case. II: concluding that potential subscribers are evenly split between the two companies when they aren’t.
b. x ≤ 6 or x ≥ 19
c. X ~ Bin(25, .5), so P-value = B(6; 25, .5) + [1 − B(18; 25, .5)] = .014
d. Rejecting H0 when P-value ≤ .044 is equivalent to rejecting when either x ≤ 7 or x ≥ 18. Then β(.3) = P(8 ≤ X ≤ 17 when p = .3) = B(17; 25, .3) − B(7; 25, .3) = .488, β(.6) = .845, β(.7) = .488
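
The binomial calculations in parts (c) and (d) of the Bin(25, p) answer above can be checked with a few lines of code. This is only an illustrative sketch: the helper names binom_cdf, two_tailed_p_value, and beta are assumptions introduced here, not notation from the text. Exact arithmetic gives a P-value of roughly .0146, slightly different from the .014 obtained by adding two table entries rounded to three decimals.

```python
from math import comb


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, x + 1))


def two_tailed_p_value(lower, upper, n, p0):
    """P(X <= lower or X >= upper) when the true proportion is p0."""
    return binom_cdf(lower, n, p0) + (1 - binom_cdf(upper - 1, n, p0))


def beta(p, n=25, lower=7, upper=18):
    """Type II error probability for the region 'reject if X <= lower or X >= upper'."""
    # H0 is not rejected when lower < X < upper, i.e. 8 <= X <= 17 by default.
    return binom_cdf(upper - 1, n, p) - binom_cdf(lower, n, p)


# P-value for the values at least as extreme as x = 6 (part c).
print(round(two_tailed_p_value(6, 19, 25, 0.5), 4))
# Type II error probabilities at several alternative values of p (part d).
print([round(beta(p), 3) for p in (0.3, 0.4, 0.6, 0.7)])
```

The symmetry of the rejection region about 12.5 is why β(.3) = β(.7) and β(.4) = β(.6).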


13. a. H0: μ = 10 versus Ha: μ ≠ 10
b. P-value = P(X̄ ≤ 9.85 or ≥ 10.15 when H0 is true) = Φ(−3.75) + [1 − Φ(3.75)] ≈ 0 (software gives .00018). Since 0 ≤ .01, H0 should be rejected. The scale does not appear to be correctly calibrated.
c. .5319, .0078


15. a. .0778 b. .1841 c. .0250 d. .0066 e. .5438
17. a. P-value = P(X̄ ≥ 30,960 when H0 is true) = 1 − Φ(2.56) = .0052 ≤ .01, so reject H0 b. .8413 c. .143 d. .0052
19. a. z = −2.27, P-value = .0232 > .01, so don’t reject H0 b. .2266 c. 22
21. a. z = −3.33, P-value = .0008 ≤ .01, so reject H0 b. .1056 c. 217
23. a. x̄ = .750, x̃ = .640, s = .3025, f_s = .480. A boxplot shows substantial positive skew; there are no outliers. b. No. A normal probability plot shows substantial curvature. No, since n is large. c. z = −5.79, P-value ≈ 0, reject H0 at any reasonable significance level; yes. d. .821
25. No. H0: μ = 2 versus Ha: μ < 2, z = −1.80, P-value = .0359 > .01, so don’t reject H0


29. a. P-value = .136 > .05, so don’t reject H0 b. .136 > .05, so don’t reject H0 c. P-value = .016 > .01, so don’t reject H0 d. P-value ≈ 0 ≤ .05, so reject H0 in favor of Ha: μ ≠ 5
31. a. P-value = .003 ≤ .05, so reject H0 b. P-value = .057 > .01, so don’t reject H0 c. P-value > .5 > α for any reasonable α, so don’t reject H0
33. a. It appears that the specification has been violated; much of the boxplot lies to the right of 200. b. t = 5.8, P-value ≈ 0, so conclude that μ ≠ 200
35. a. t ≈ 1.2, P-value ≈ .128 > .05, so H0: μ = 200 cannot be rejected b. .30 (from software)
37. a. Yes, because the pattern in a normal probability plot is reasonably linear. b. P-value > .10 (barely), so H0: μ = 100 should not be rejected at any sensible α, and the concrete should be used.
39. a. t = 2.43, P-value ≈ .013 > .01 = α, so there is not compelling evidence. b. Yes, type II c. .66 (from software)
41. t ≈ 1.9, so P-value ≈ .116. H0 should not be rejected.
43. a. H0: p = .2 versus Ha: p > .2, z = 1.27, P-value = .10 > .05, so there is not compelling evidence for rejecting H0. b. I: say that more than 20% are obese when this is not the case; II: conclude that 20% are obese when the actual percentage exceeds 20%. c. .121
45. z = 3.67, P-value ≈ 0, so reject H0: p = .40. No
47. a. z = −1.0, so there is not enough evidence to conclude that p < .25; thus, use screwtops. b. I: Don’t use screwtops when their use is justified; II: Use screwtops when their use is not justified.
49. a. z = 3.07, P-value = .0022 ≤ .01, so reject H0 and the company’s premise. b. .0332


51. No, no, yes. α = .098, β = .090
53. a. .8888, .1587, .0006 b. P-value ≈ 0. Yes c. No
55. a. .049, .096 b. 69
57. z = −3.12, P-value = .0018 ≤ .05, so conclude that μ ≠ 3.20
59. a. H0: μ = .85 versus Ha: μ ≠ .85 b. H0 cannot be rejected at either α
61. a. No, because P-value = .02 > .01; yes, because 45.31 greatly exceeds 20, but n is very small. b. β = .57 (software)
63. a. No, no b. No, because z = .44 and P-value = .33 > .10.
65. a. Approximately .6; approximately .2 (from Appendix Table A.17) b. n = 28
67. a. z = 1.64, P-value ≈ .1 > .05, so H0 cannot be rejected. Type II b. Yes.
69. Yes, z = −3.32, P-value = .0005 ≤ .001, so H0 should be rejected.
71. No, since z = 1.33 and P-value = .0918
73. No, since P-value = .2296
75. z = .92, P-value = .1788, so it cannot be concluded that p > 20.
77. .01 < P-value < .025, so do not reject H0; no contradiction
79. a. Test statistic is χ² = 2ΣX_i/μ0, P-value = area under the χ²_{2n} curve to the left of the calculated χ². b. χ² = 19.65, P-value > .10, so H0: μ = 75 can’t be rejected.


Chapter 9 


1. a. −.4 hr; it doesn’t b. .0724, .2691 c. No
3. a. Yes; P-value ≈ 0 ≤ .01, so reject H0. b. P-value = .0132, so yes at significance level .05 but no at level .01.
5. a. z = −2.90, P-value = .0019, so reject H0. b. .8212 c. 66
7. No, since P-value for a 2-tailed test is .0602.
9. a. 6.2; yes b. z = 1.14, P-value ≈ .25, no c. No d. A 95% CI is (10.0, 21.8).
11. A 95% CI is (.99, 2.41).
13. 50
15. b. It increases.
17. a. 17 b. 21 c. 18 d. 26
19. t = −1.20, P-value = .196, so do not reject H0.
21. No; t = −2.46, df = 15, P-value ≈ .013, so do not reject H0 (a close call).
23. b. No c. t = −.38, P-value ≈ .7, so don’t reject H0.


25. a. Both normal probability plots exhibit substantial linear patterns. b. Average price for the ≥ 93 wines appears to significantly exceed that for the ≤ 89 wines. c. (16.1, 82.0); No, because this CI does not include 0.
27. a. 99% CI: (.33, .71) b. 99% CI: (−.07, .41), so 0 is a plausible value of the difference.
29. t = −2.10, df = 25, P-value = .023. At significance level .05, we would conclude that cola results in a higher average strength, but not at significance level .01.
31. a. Virtually identical centers, substantially more variability in medium range observations than in higher range observations b. (−7.9, 9.6), based on 23 df; no
33. No, t = 1.33, P-value = .094, don’t reject H0
35. t = −2.2, df = 16, P-value = .021 > .01 = α, so don’t reject H0.
37. a. (−.561, −.287) b. Between −1.224 and .376
39. a. Yes b. t = 2.7, P-value = .018 < .05 = α, so H0 should be rejected.
41. a. (−3.85, 11.35) b. Yes. Since P-value = .02, at level .05 there would appear to be an increase, but not at level .01. c. (7.02, 10.06)
43. a. No b. −49.1 c. 49.1
45. a. Yes, because of the linear pattern in a normal probability plot. b. No, data is paired, not independent samples c. t = 3.66, P-value = .001 (not .003), same conclusion.
47. a. 95% CI: (−2.52, 1.05); plausible that they are identical b. Linear pattern in npp implies normality of difference distribution is plausible.
49. z = 2.84, P-value = .0023 ≤ .05, so H0 can be rejected; the introduction of context appears to lower the correct response rate.
51. z = 3.20, P-value = .0007, so H0 can be rejected at significance level .05, level .01, or even level .001. There does appear to be a nocebo effect.
53. a. z = .80, P-value > .05, so don’t reject H0. b. n = 1211
55. a. The CI for ln(θ) is ln(θ̂) ± z_{α/2}[(m − x)/(mx) + (n − y)/(ny)]^(1/2). Taking the antilogs of the lower and upper limits gives a CI for θ itself.
b. (1.43, 2.31); aspirin appears to be beneficial. 
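
The interval described in part (a) can be sketched in code as follows. This is an illustration under stated assumptions, not the book's own computation: the function name and the counts passed to it (189 events among 11,034 subjects in one group and 104 among 11,037 in the other) are assumed here for the sake of a runnable example and do not appear in this answer key; θ̂ is taken to be (x/m)/(y/n).

```python
from math import exp, log, sqrt
from statistics import NormalDist


def ratio_ci(x, m, y, n, conf=0.95):
    """Large-sample CI for theta = p1/p2 based on x successes in m and y in n trials."""
    theta_hat = (x / m) / (y / n)
    # Estimated standard deviation of ln(theta_hat), as in part (a).
    se = sqrt((m - x) / (m * x) + (n - y) / (n * y))
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z critical value
    lower = exp(log(theta_hat) - z * se)  # antilog of the lower limit
    upper = exp(log(theta_hat) + z * se)  # antilog of the upper limit
    return lower, upper


# Hypothetical counts, chosen only for illustration.
print(ratio_ci(189, 11034, 104, 11037))
```

Working on the log scale and then exponentiating keeps both limits positive, which a symmetric interval built directly around θ̂ would not guarantee.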


57. (—.35, .07) 

59. a. 3.69 b. 4.82 c. .207 d. .271 e. 4.30 f. .212 g. .95 h. .94
61. f = .384, P-value > .10, so don’t reject H0.
63. f = 2.85 ≥ F_{.05,19,22} ≈ 2.08. Thus P-value < .05, so reject H0; there does appear to be more variability in low-dose weight gain.
65. (s2² F_{1−α/2}/s1², s2² F_{α/2}/s1²); (.023, 1.99)
67. No. t = 3.2, df = 15, P-value = .006, so reject H0: μ1 − μ2 = 0 using either α = .05 or .01.
69. z > 0 ⇒ P-value > .5, so H0: p1 − p2 = 0 cannot be rejected.
71. (−299.3, 1517.9)
73. They appear to differ, since df = 14, t = −5.19, P-value ≈ 0.
75. Yes, t = −2.25, df = 57, P-value ≈ .028.
77. a. No. t = −2.84, df = 18, P-value ≈ .012 b. No. t = −.56, P-value ≈ .29
79. t = 3.9, P-value = .004, so H0 is rejected at level .05 or .01.
81. No, nor should the two-sample t test be used, because a normal probability plot suggests that the good-visibility distribution is not normal.
83. Unpooled: df = 15, t = −1.8, P-value ≈ .092
    Pooled: df = 24, t = −1.9, P-value ≈ .070
85. a. m = 141, n = 47 b. m = 240, n = 160
87. No, z = .83, P-value ≈ .20
89. .9015, .8264, .0294, .0000; true average IQs; no
91. Yes; z = 4.2, P-value ≈ 0
93. a. Yes. t = −6.4, df = 57, and P-value ≈ 0 b. t = 1.1, P-value = .14, so don’t reject H0.
95. (−1.29, −.59)


Chapter 10 


1. f = 2.44, F_{.05,4,15} = 3.06, F_{.10,4,15} = 2.36. Thus .05 < P-value < .10, so H0 should not be rejected.
3. f = 1.30 < 2.57 = F_{.10,2,21}, so P-value > .10. H0 cannot be rejected at any reasonable significance level.
5. f = 1.73 < 2.51 = F_{.10,2,27}, so P-value > .10 and the three grades don’t appear to differ.
7. f = 51.3, P-value ≈ 0, so H0 can be rejected at any reasonable significance level.
9. f = 3.96 and F_{.05,3,20} = 3.10 < 3.96 < 4.94 = F_{.01,3,20}, so .01 < P-value < .05. Thus H0 can be rejected at significance level .05; there appear to be differences among the grains.
11. w = 36.09
    3      1      4      2      5
    437.5  462.0  469.3  512.8  532.1
    Brands 2 and 5 don’t appear to differ, nor does there appear to be any difference between brands 1, 3, and 4, but each brand in the first group appears to differ significantly from all brands in the second group.
13. 3      1      4      2      5
    427.5  462.0  469.3  502.8  532.1
15. 14.18  17.94  18.00  18.00  25.74  27.67
17. (−.029, .379)
19. Any value of SSE between 422.16 and 431.88 will work.
21. a. f = 22.6, F_{.001,5,73} ≈ 4.6, P-value < .001, so reject H0. b. (−99.16, −35.64), (29.34, 94.16)
23.      1    2            3            4
    1    -    2.88 ± 5.81  7.43 ± 5.81  12.78 ± 5.48
    2    -    -            4.55 ± 6.13  9.90 ± 5.81
    3    -    -            -            5.35 ± 5.81
    4    -    -            -            -
    4  3  2  1
25. a. Normal, equal variances b. SSTr = 8.33, SSE = 77.79, f = 1.7, H0 should not be rejected (P-value > .10)
27. 


31. 
33. 
35. 


37. 


39. 
41. 


43. 


45. 


a. f = 3.75, .01 < P-value < .05, so at significance level 

.05 brands appear to differ. 

b. Normality is quite plausible (a normal probability plot 

of the residuals x_ij − x̄_i· shows a linear pattern).

c. 4 3 2 1 Only brands 1 and 4 appear to differ significantly.


Approximately .62 
arcsin(√(x/n))


a. .01 < P-value < .05, so H0 is not rejected.
b. .029 > .01, so again H0 is not rejected.


f = 8.44 > 6.49 = Fo), so P-value < .001 and Hy should 


be rejected. 


5 3 1 4 2 


The Cl is (—.144, .474), which does include 0. 


2.92 < f = 3.96 < 4.07, so .05 < P-value < .10 and
H0: σ_A² = 0 cannot be rejected.


(—3.70, 1.04), (—4.83, —.33), (—3.77, 1.27), (—3.99, .15). 
Only «4, — 3 among these four contrasts appears to differ 
significantly from zero. 


They are identical. 


Chapter 11 


1. a. f_A = 7.16, .01 < P-value < .05, so reject H0A. b. f_B = 10.42, .01 < P-value < .05, so reject H0B.
3. a. f_A = 18.77, P-value = .023, f_B = 21.10, P-value = .016 (from software), so reject both H0A and H0B at significance level .05. b. Q_{.05,4,3} = 6.825, w = .257; .201 .324 .462 .602
5. f_A = 2.56, F_{.01,3,12} = 5.95, so there appears to be no effect due to angle of pull.
7. a. SSA = 22.889, SSB = 27.556, SSE = 5.111, f_A = 8.96, .01 < P-value < .05 (software gives .033), so at level .05 there does appear to be a brand effect. b. f_B = 10.78, P-value = .024, blocking does appear to have been effective.
9. Source      df  SS      MS     f     F.05
   Treatments  3   81.19   27.06  22.4  3.01
   Blocks      8   66.50   8.31
   Error       24  29.06   1.21
   Total       35  176.75

   1     4     3      2
   8.56  9.22  10.78  12.44
11. A normal probability plot of the residuals shows a substantial linear pattern. There is no discernible pattern in a plot of the residuals versus the fitted values.
13. b. Each SS is multiplied by c², but f_A and f_B are unchanged.
15. a. Approximately .20, .43 b. Approximately .30
17. a. f_A = 3.76, f_B = 6.82, f_AB = .74, and F_{.05,2,9} = 4.26, so the amount of carbon fiber addition appears significant. b. f_A = 6.54, f_B = 5.33, f_AB = .27
19. f_AB = 3.18. Since F_{.01,10,18} = 3.51, P-value > .01 for testing H0AB; hence the interaction effect is not significant. f_A = .94, P-value > .10; f_B = 9.17, P-value < .001. Thus type of farm doesn’t seem to matter but maintenance method does.
21. a, b. Source  df  SS         MS         f
          A       2   22,941.80  11,470.90  22.98
          B       4   22,765.53  5691.38    5.60
          AB      8   3993.87    499.23     .49
          Error   15  15,253.50  1016.90
          Total   29  64,954.70
    H0A and H0B are both rejected.




23. Source  df  SS         MS       f
    A       2   11,573.38  5786.69  MSA/MSAB = 26.70
    B       4   17,930.09  4482.52  MSB/MSE = 28.51
    AB      8   1734.17    216.77   MSAB/MSE = 1.38
    Error   30  4716.67    157.22
    Total   44  35,954.31
    Since F_{.01,8,30} = 3.17, F_{.01,2,8} = 8.65, and F_{.01,4,30} = 4.02, H0G is not rejected but both H0A and H0B are rejected.
25. (−.373, −.033)
27. a. Source  df  SS          MS          f        F.05
       A       2   14,144.44   7072.22     61.06    3.35
       B       2   5511.27     2755.64     23.79    3.35
       C       2   244,696.39  122,348.20  1056.27  3.35
       AB      4   1069.62     267.41      2.31     2.73
       AC      4   62.67       15.67       .14      2.73
       BC      4   331.67      82.92       .72      2.73
       ABC     8   1080.77     135.10      1.17     2.31
       Error   27  3127.50     115.83
       Total   53  270,024.33
    d. Q_{.05,3,27} = 3.51, w = 8.90, and all three of the levels differ significantly from one another.
29. a. f_ABC = 2.87, P-value = .029 for testing H0ABC. However, all two-factor interaction F ratios are highly significant.
31. Source  DF  SS      MS      F      P
    A       2   124.60  62.30   4.85   0.042
    B       2   20.61   10.30   0.80   0.481
    C       2   356.95  178.47  13.89  0.002
    A*B     4   57.49   14.37   1.12   0.412
    A*C     4   61.39   15.35   1.19   0.383
    B*C     4   11.06   2.76    0.22   0.923
    Error   8   102.78  12.85
    Total   26  734.87
    a. The P-values in the foregoing ANOVA table for the AB, AC, and BC effects all considerably exceed .1, indicating that at any reasonable significance level, the hypotheses of no two-factor interactions cannot be rejected.
    b. At significance level .05, the Power main effect is significant (somewhat of a close call) and the Paste Thickness main effect is highly significant.
    c. w = 4.83. The sample means for the three levels of paste thickness are 29.562 (for .4), 35.183 (for .3), and 38.356 (for .2). So the level .4 can be judged significantly different from the other two levels.
33. Source  df  SS      MS     f
    A       6   67.32   11.02
    B       6   51.06   8.51
    C       6   5.43    .91    .61
    Error   30  44.26   1.48
    Total   48  168.07
    F_{.05,6,30} = 2.42, f_C = .61, so H0C is not rejected.
35. Source  df  SS     MS    f
    A       4   28.88  7.22  10.7
    B       4   23.70  5.93  8.79
    C       4   .62    .155  <1
    Error   12  8.10   .675
    Since F_{.05,4,12} = 3.26, both A and B are significant.
37. Source  df  MS        f
    A       2   2207.329  2259*
    B       1   47.255    48.4*
    C       2   491.783   503*
    D       1   .044      <1
    AB      2   15.303    15.7*
    AC      4   275.446   282*
    AD      2   .470      <1
    BC      2   2.141     2.19
    BD      1   .273      <1
    CD      2   .247      <1
    ABC     4   3.714     3.80
    ABD     2   4.072     4.17*
    ACD     4   .167      <1
    BCD     2   .280      <1
    ABCD    4   .347      <1
    Error   36  .977
    Total   71  93.621
    *Denotes a significant F ratio.
39. eo de sean Eye rene Ta ieee y gh ee neces
    Source  Contrast  MS         f
    A       1307      71,177.04  436.7
    B       1305      70,959.34  435.4
    C       529       11,660.04  71.54
    AB      199       1650.04    10.12
    AC      −53       117.04     <1
    BC      57        135.38     <1
    ABC     27        30.38      <1
    Error             162.98
41. Source  DF  SS        MS        F        P
    A       1   0.003906  0.003906  25.00    0.001
    B       1   0.242556  0.242556  1552.36  0.000
    C       1   0.178506  0.178506  1142.44  0.000
    A*B     1   0.003906  0.003906  25.00    0.001
    A*C     1   0.002256  0.002256  14.44    0.005
    B*C     1   0.178506  0.178506  1142.44  0.000
    A*B*C   1   0.002256  0.002256  14.44    0.005
    Error   8   0.001250  0.000156
    Total   15  0.613144
    All effects are significant, and in particular the three-factor interaction effect is significant.


43. Source df SS f 51. Source df SS MS f 
: ! 436 =! Amaineffects 1 322.667 322.667 980.38 
. : 099 =I Bmaineffects 3 35.623. 11.874. 36.08 
Cc ! 109 =! Interaction 3 8.557 2.852 8.67 
2p ! ae 851 Error 16 5.266 0.329 
on / 003 <1 Total 23 372.113 
AC 1 078 <1 F 053.16 = 3-24, so interactions appear to be present. 
a : oF i 53 Saubee df SS MS f 
BC 1 1.404 3.62 ° 
jon 1 ea ne A 1 30.25 30.25 6.72 
od ; sale wae B 1 144.00 144.00 32.00 
ior BO rn Cc 1 12.25 12.25 272 
F515 = 6.61, so only the factor D main effect is judged AB 1 1122.25 1122.25 249.39 
nenincant, AC 1 1.00 1.00 pe 
45. a. 1: (1), ab, cd, abcd; 2: a, b, acd, bed; 3: c, d, abc, abd; BC 1 12.25 12.25 2.72 
4: ac, be, ad, bd. ABC 1 16.00 16.00 3.56 
Error 4 36.00 4.50 
b. Source df SS f Total 7 
Only the main effect for B and the AB interaction effect 
A 12,403,125 27.18 are significant at a = .01. 
i a aa 55. a. &, = 9.00, B, = 2.25, 8, = 17.00,7, = 21.00, 
D 1 60.500 0.13 (@B),, = 0, (a8); = 2.00, (ay); = 2.75, 
AC l 10.125 0.02 (B5)\; = .75, (BY). = -50, (6 y)) = 4.50 
AD I 91.125 0.20 b. A normal probability plot suggests that the A, C, and 
BC I 50.000 0.11 D main effects are quite important, and perhaps the CD 
BD l 420.500 0.92 interaction. In fact, pooling the 4 three-factor interaction 
ABC I 3.125 0.01 SS’s and the four-factor interaction SS to obtain an SSE 
ABD I 0.500 0.00 based on 5 df and then constructing an ANOVA table sug- 
ACD I 200.000 0.44 gests that these are the most important effects. 
BCD 1 2.000 0.00 57. Source DF SS MS F P 
Blocks d 898.875 0.28 A 2 34436 17218 436.92 0.000 
Error 12 5475.750 B 2 105793 52897 1342.30 ~—:0.000 
Total 31 111,853.875 6 2 516398 258199 6552.04 0.000 
F o11,12 = 9.33, so only the A and B main effects are A*B 4 6868 1717 43.57. 0.000 
significant. A*C 4 10922 2731 69.29 0.000 
47. a. ABFG; (1), ab, cd, ce, de, fg, acf, adf, adg, aef, acg, aeg, BEC 4 10178 2545 64.57 0.000 
beg, bef, bdf, bdg, bef, beg, abcd, abce, abde, abfg, cdfg, cefg, A*B*C 8 6713 839 21.30 0.000 
defg, acdef, acdeg, bcdef, bcdeg, abcdfg, abcefg, abdefg. {A, Error 27 1064 39 


BCDE, ACDEFG, BFG}, {B, ACDE, BCDEFG, AFG}, {C, 
ABDE, DEFG, ABCFG}, {D, ABCE, CEFG, ABDFG}, {E, 
ABCD, CDFG, ABEFG}, {F, ABCDEF, CDEG, ABG}, {G, 59. Based on the P-values in the ANOVA table, statistically 


Obviously all effects are highly significant. 


ABCDEG, CDEF, ABF}. b. 1: (1), aef, beg, abcd, abfg, significant factors at the level a = .01 are adhesive type 
cdfg, acdeg, bcdef: 2: ab, cd, fg, aeg, bef, acdef, bedeg, and cure time. The conductor material does not have a 
abcdfg; 3: de, ace, adf, bef, bdg, abce, cefg, abdefe; 4: ce, acf, statistically significant effect on bond strength. There are 
adg, bcg, bdf, abde, defg, abcefe. - no significant interactions. 
49. SSA = 2.250, SSB = 7.840, SSC = .360, SSD = 52.563, 61. Source df SS mS f 
SSE = 10.240, SSAB = 1.563, SSAC = 7.563, SSAD = 
.090, SSAE = 4.203, SSBC = 2.103, SSBD = .010, 7 ; ee ee ea 
SSBE = .123, SSCD = .010, SSCE = .063, SSDE = C 4 2867.76 716.94 5.958 
4.840. Error SS = sum of two-factor SS’s = 20.568, Error D 4 5536.56 1384. 14 11.502 
eee F 911,19 = 10.04, so only the D main effect is Error 8 962.72 120.34 Fog4g = 3.84 
. Total 24 
HA, and Ho, cannot be rejected, while Hy. and Hop are 
rejected. 


Chapter 12 


1. a. The accompanying displays are based on repeating 
each stem value five times (once for leaves 0 and 1, a 
second time for leaves 2 and 3, etc.). 

17 | 0
17 | 23
17 | 445
17 | 67
17 |             stem: hundreds and tens
18 | 0000011     leaf: ones
18 | 2222
18 | 445
18 | 6
18 | 8

There are no outliers, no significant gaps, and the 
distribution is roughly bell-shaped with a reason- 
ably high degree of concentration about its center at 
approximately 180. 

0 | 889
1 | 0000
1 | 3
1 | 4444
1 | 66
1 | 8889     stem: ones
2 | 11       leaf: tenths
2 |
2 | 5
2 | 6
2 |
3 | 00

A typical value is about 1.6, and there is a reasonable 
amount of dispersion about this value. The distribution is 
somewhat skewed toward large values, the two largest of 
which may be candidates for outliers. 

b. No, because observations with identical x values have 
different y values. 

c. No, because the points don’t appear to fall at all close 
to a line or simple curve. 

3. Yes. Yes. 

5. b. Yes. 

c. There appears to be an approximate quadratic relation- 
ship (points fall close to a parabola). 

7. a. 5050 b. 1.3 c. 130 d. —130 

9. a. .095 b. −.475 c. .830, 1.305
d. .4207, .3446 e. .0036

11. a. —.01, —.10 b. 3.00, 2.50 
c. .3627 d. .4641 

13. a. Yes, because r² = .972.


15. a. 2 | 9
       3 | 335566677889
       4 | 122356689
       5 | 1
       6 | 29
       7 | 9
       8 | 0
    Typical value in low 40s, reasonable amount of variability, positive skewness, two potential outliers
    b. No
    c. y = 3.2925 + .10748x = 7.59. No; danger of extrapolation
    d. 18.736, 71.605, .738, yes
17. a. 118.91 − .905x; yes. b. We estimate that the expected decrease in porosity associated with a 1-pcf increase in unit weight is .905%. c. Negative prediction, but y can’t be negative. d. −.52, .49 e. .938, roughly the size of a typical deviation from the estimated regression line. f. .974
19. a. y = −45.5519 + 1.7114x b. 339.51 c. −85.57 d. The ŷ_i’s are 125.6, 168.4, 168.4, 211.1, 211.1, 296.7, 296.7, 382.3, 382.3, 467.9, 467.9, 553.4, 639.0, 639.0; a 45° line through (0, 0).
21. a. Yes; r² = .985 b. 368.89 c. 368.89
23. a. 16,213.64; 16,205.45 b. 414,235.71; yes, since r² = .961.
27. β̂1 = Σx_i y_i/Σx_i²
29. Data set  r²   s      Most effective: set 3
    1         .43  4.03   Least effective: set 1
    2         .99  4.03
    3         .99  1.90
31. a. .893 b. .01837 c. (−.216, −.136)
33. a. (.081, .133) b. Ha: β1 > .1, P-value = .277, no
35. a. (.63, 2.44) is a 95% CI. b. Yes. t ≈ 3.6, P-value ≈ .004 c. No; extrapolation d. (.54, 2.82), no
37. a. Yes. t = 7.99, P-value ≈ 0. [Note: There is one mild outlier, so the resulting normal probability plot is not entirely satisfactory.] b. Yes. t = −5.8, P-value ≈ 0, so reject H0: β1 = 1 in favor of Ha: β1 < 1
39. f = 71.97, s_β̂1 = .004837, t = 8.48, P-value = .000
43. d = 1.20, df = 13, and β ≈ .1.
45. a. (77.80, 78.38) b. (76.90, 79.28), same center but wider c. wider, since 115 is farther from x̄ d. t = −11, P-value ≈ 0
47. a. 95% PI is (20.21, 43.69), no b. (28.53, 51.92), at least 90%
49. (431.3, 628.5)
51. a. 45 is closer to x̄ = 45.18 b. (46.28, 46.78) c. (47.56, 49.84)
53. (a) narrower than (b), (c) narrower than (d), (a) narrower than (c), (b) narrower than (d)
57. If, for example, 18 is the minimum age of eligibility, then for most people y ≈ x − 18.
59. a. .966 b. The percent dry fiber weight for the first specimen tends to be larger than for the second. c. No change d. 93.3% e. t = 14.9, P-value ≈ 0, so there does appear to be such a relationship.
61. a. r = .748, t = 3.9, P-value = .001. Using either α = .05 or .01, yes. b. .560 (56%), same
63. r = .773, yet t = 2.44, P-value ≈ .07; so H0: ρ = 0 cannot be rejected.
65. a. .481 b. t = 1.98, P-value = .07, so at level .01, no linear association. c. At level .01, no positive linear association, but at level .05, there does appear to be positive linear association.


67. 


69. 


71. 


73. 


75. 


77. 


81. 
87. 


a. Reject H, 

b. No. P-value = .00032 => z ~ 3.6 => r= .16, which 
indicates only a weak relationship. 

ce. Yes, but very large n => p ~ .022, so no practical 
significance. 


a. 95% CI: (.888, 1.086) 

b. 95% CI: (47.730, 49.172)

c. 95% PI: (45.378, 51.524)

d. Narrower for x = 25, since 25 is closer to x̄
981 


a. 16.0593, .1925 b. t = 54.15, P-value = 0 
c. x̄ = .408, and .2 is farther from this than is .4.
d. (6.41, 6.82) e. (5.96, 7.27) 


a. 507 b. .712 c. P-value = .0013 < .01 = α,
so reject H0: β1 = 0 and conclude that there is a useful
linear relationship. d. A 95% CI is (1.056, 1.275).

e. 1.0143, .2143 


a. y = 1.69 + .0805x b. y = —20.40 + 12.2254x 
c. .984 for both regressions. 


ceo 


a. Yes, the points fall very close to a straight line. 
b. .996 c. Yes; t = 54.6, P-value = 0

d. 95% PI: (3.17, 4.57) 

e. t = 54.6, P-value = 0 

b. .573 


t = −1.14, so it is plausible that β1 = γ1.


Chapter 13 


1. 


. a. Yes. 


. a. .776 


a. 6.32, 8.37, 8.94, 8.37, and 6.32 b. 7.87, 8.49, 8.83, 
8.94, and 2.83 c. The deviation is likely to be much 
smaller for the x values of part (b). 


b. −.31, −.31, .48, 1.23, −1.15, .35, −.10,
−1.39, .82, −.16, .62, .09, 1.17, −1.50, .96, .02, .65,
−2.16, −.79, 1.74. Here e/e* ranges between .57 and .65,
so e* is close to e/s. c. No.


a. About 98% of observed variation in thickness is
explained by the relationship.
b. A nonlinear relationship 


b. Perhaps not, because of curvature. 

c. Substantial curvature rather than a linear pattern, 
implying inadequacy of the linear model. A parabola 
(quadratic regression) provides a significantly better fit. 


For set 1, simple linear regression is appropriate. A quadratic
regression is reasonable for set 2. In set 3, (13, 12.74)
appears very inconsistent with the remaining data. The
estimated slope for set 4 depends largely on the single
observation (19, 12.5), and evidence for a linear relationship
is not compelling.
11.
13. 
15. 


17. 


19. 


c. V(Ŷi) increases, and V(Yi − Ŷi) decreases.
t with n — 2 df; .02 


a. A curved pattern b. A linear pattern 

c. Y = αx^β · ε d. A 95% PI is (3.06, 6.50).

e. One standardized residual, corresponding to the third 
observation, is a bit large. There are only two positive 
standardized residuals, but two others are essentially 0. 
The patterns in a standardized residual plot and normal 
probability plot are marginally acceptable. 


a. Σx'i = 15.501, Σy'i = 13.352, Σ(x'i)² = 20.228,
Σx'i y'i = 18.109, Σ(y'i)² = 16.572, β̂1 = 1.254,
β̂0 = −.468, α̂ = .626, β̂ = 1.254 c. t = −1.02, so don’t
reject H0. d. H0: β = 1, t = 3.28, so reject H0.


a. No b. Y' = β0 + β1 · (1/t) + ε', where Y' = ln(Y),
so Y = αe^(β/t) · ε. c. β̂ = β̂1 = 3735.45, β̂0 = −10.2045,
α̂ = (3.70034) · (10⁻⁵), ŷ' = 6.7748, ŷ = 875.5


d. SSE = 1.39587, SSPE = 1.36594 (using transformed 
values), f = .33, P-value > .1, so don’t reject H0.


21 


23. 


25. 


27. 


29. 


31. 


33. 


35. 
37. 


39. 


41. 


43. 


45. 


47. 


49. 


51 


. a. Parabolic opening downward 
b. Very close to 1 
c. (83.89, 87.33) 


For the exponential model, V(Y|x) = α²e^(2βx)σ², which does
depend on x. A similar result holds for the power model.
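
The variance claim follows in one step from the multiplicative form of the model. A short derivation, assuming the exponential model is written as Y = αe^(βx) · ε with V(ε) = σ² (this symbol assignment is inferred from the answer rather than stated in it):

```latex
\begin{align*}
Y &= \alpha e^{\beta x}\,\varepsilon, \qquad V(\varepsilon) = \sigma^2 \\
V(Y \mid x) &= \left(\alpha e^{\beta x}\right)^2 V(\varepsilon)
             = \alpha^2 e^{2\beta x}\,\sigma^2
\end{align*}
```

Under the same assumption, the power model Y = αx^β · ε gives V(Y|x) = α²x^(2β)σ², which also depends on x.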


The z ratio for β1 is highly significant, indicating that the likelihood
of a level being acceptable does decrease as the level
increases. We estimate that for each 1 dBA increase in noise
level, the odds of acceptability decrease by a factor of .70.
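
In a logistic regression, an odds factor such as .70 is the exponentiated slope. The sketch below uses a hypothetical slope estimate set to ln(.70), since the fitted coefficient itself is not given in this answer, and shows how the factor compounds over a larger change in noise level.

```python
import math

# Hypothetical slope: chosen so that exp(beta1_hat) = .70, matching the
# stated odds factor. The actual fitted coefficient is not given here.
beta1_hat = math.log(0.70)


def odds_multiplier(beta1: float, delta_x: float = 1.0) -> float:
    """Factor by which the odds change when x increases by delta_x."""
    return math.exp(beta1 * delta_x)


per_dba = odds_multiplier(beta1_hat)        # 1 dBA increase
per_5_dba = odds_multiplier(beta1_hat, 5)   # 5 dBA increase

print(f"odds factor per 1 dBA: {per_dba:.2f}")
print(f"odds factor per 5 dBA: {per_5_dba:.3f}")
```

The multiplier acts on the odds, not on the probability of acceptability itself.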


b. 52.88, .12 c. .895 d. No 

e. (48.54, 57.22) f. (42.85, 62.91) 

a. SSE = 16.8, s = 2.048 b. R² = .995 c. Yes.
t = −6.55, P-value = .003 (from Minitab) d. 98%
individual confidence levels ⇒ joint confidence level ≥
96%: (.671, 3.706), (−.00498, −.00135)
e. (69.531, 76.186), (66.271, 79.446), using software
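
The step from 98% individual levels to a 96% joint level in part (d) is consistent with the Bonferroni inequality. A minimal sketch, assuming that is the argument intended (the function name is illustrative only):

```python
def bonferroni_joint_level(individual_level: float, k: int) -> float:
    """Lower bound on the joint confidence level of k intervals,
    each constructed at the same individual confidence level."""
    alpha = 1.0 - individual_level
    return 1.0 - k * alpha


# Two intervals, each at the 98% level.
joint = bonferroni_joint_level(0.98, 2)
print(f"joint confidence level is at least {joint:.2f}")
```

The bound is conservative: the actual joint level may be higher than the value returned.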


a. .980 

b. .747, much less than .977 for the cubic model. 
c. Yes, since t = 14.18, P-value = 0. 

d. (6.31, 6.57), (6.06, 6.81) 

e. t = —5.6, P-value = 0 

a. .9671, .9407

b. .0000492x³ − .000446058x² + .007290688x +
96034944 c. t = 2 < 3.182 = t.025,3, so the cubic term
should be deleted. d. Identical

e. .987, .994, yes 


ŷ = 7.6883e^(.1799x − .0022x²)


a. 4.9 b. When number of deliveries is held fixed, 
the average change in travel time associated with a 1-mile 
increase in distance traveled is .060 hr. When distance 
traveled is held fixed, the average change in travel time 
associated with one extra delivery is .900 hr. c. .9861 


a. 77.3 b. 40.4 


a. f = 475, P-value = 0b. 20,826.14 d. (—6694.020, 
—5895.438) e. t = 2.59, P-value = .01, retain x, in the 
model. 


a. 48.31, 3.69 b. No. If x, increases, either x, or x, must 
change. c. Yes, since f = 18.924, P-value = .001. 
d. Yes, using α = .01, since t = 3.496 and P-value = .003.


a. f = 87.6, P-value = 0, so there does appear to be a 
useful linear relationship between y and at least one of the 
predictors. b. .935 c. (9.095, 11.087)


b. P-value = .000, so conclude that the model is useful. 
c. P-value = .034 ≤ .05 = α, so reject H0: β3 = 0; %
garbage does appear to provide additional useful 
information. d. (1479.8, 1531.1), reasonable precision 
e. A 95% PI is (1435.7, 1575.2). 


a. f = 17.31, P-value = .000, utility of the model is confirmed.
b. t = 3.96, P-value = .002, retain the interaction
predictor. c. (5.73, 8.17) d. (2.97, 10.93).


.a. t = .30, P-value = .777, so delete x, b. f = 
15.29, .001 < P-value < .01, so confirm model utility 


55. 


57. 


59, 
61. 
63. 


65. 


67. 


69. 


71. 


at significance level .05. c. (−.01180, −.00174)
d. (2.93, 3.81) e. A normal probability plot of e* is 
quite straight, and plots of e* versus x, and e* versus x, 
show no discernible pattern. 


a. f = 6.40, .01 < P-value < .05, so at significance level .05
model utility is confirmed. b. No. Since P-value = 


510, Hy: B; = O cannot be rejected. ct = 4.69, 
P-value = .001, so model utility is confirmed. d. That 
a nonlinear model should be fit. e. f = 20.36, 


P-value < .001, model utility is confirmed; (30.81, 36.97) 


k R² adj. R² C_k
1 .676 .647 138.2
2 .979 .975 2.7
3 .9819 .976 3.2
4 .9824 4

a. The model with k = 2 b. No 


a. The model with predictors x1, x3, and x5.
b. No. All R² values are much less than .9.


The impact of these two observations should be further 
investigated. Not entirely. The elimination of observation 
#6 followed by re-regressing should also be considered. 


a. The two distributions have similar amounts of variability, 
are both reasonably symmetric, and contain no outliers. The 
main difference is that the median of the crack values is 
about 840, whereas it is about 480 for the no-crack values. A 
95% t CI for the difference between means is (132, 557). 


b. r² = .577 for the simple linear regression model,
P-value for model utility = 0, but one standardized resid- 
ual is —4.11! Including an indicator for crack—no crack 
does not improve the fit, nor does including an indicator 
and interaction predictor. 


a. When gender, weight, and heart rate are held fixed, we
estimate that the average change in VO2max associated with
a 1-minute increase in walk time is −.0996. b. When
weight, walk time, and heart rate are held fixed, the estimate
of average difference between VO2max for males and
females is .6566. c. 3.669, −.519 d. .706


e. f= 9.0, P-value < .001, so there does appear to be a 
useful relationship. 


a. No. There is substantial curvature in the scatterplot.
b. Cubic regression yields R² = .998 and a 95% PI of
(261.98, 295.62), and the cubic predictor appears to be
important (P-value = .001). A regression of y versus ln(x)
has r² = .991, but there is a very large standardized residual
and the standardized residual plot is not satisfactory.


a. R² = .802, f = 21.03, P-value = .000. pH is a candidate
for deletion. Note that there is one extremely large
standardized residual.

b. R² = .920, adjusted R² = .774, f = 6.29, P-value = .002
c. f = 1.08, P-value > .10, don’t reject H0: β6 = ··· =
β20 = 0. The group of second-order predictors does not
appear to be useful.


73. 


75. 


d. R? = .871, f = 28.50, P-value = .000, and now all six 
predictors are judged important (the largest P-value for any 
t-ratio is .016); the importance of pH? was masked in the 
test of (c). Note that there are two rather large standardized 
residuals. 


a. f = 1783, so the model appears useful. 

b. t = −48.1, P-value = 0, so even at level .001 the quadratic
predictor should be retained.

c. No d. (21.07, 21.65) e. (20.67, 22.05) 


a. f = 30.8, P-value < .001, so the model appears useful.
b. t = −7.69 and P-value < .001, so retain the quadratic
predictor. c. (44.01, 47.91)


77. 


79. 
81. 


83. 


a. At significance level .05, yes, since f= 4.06 and 
P-value = .029. b. Yes, because f= 20.1 and 
F 053.17 = 3-49. The full versus reduced F' test cannot be 
used since the predictors in this model are not a subset of 
those in (a). 

There are several reasonable choices in each case. 


a. f = 106, P-value ~ 0 b. (.014, .068) 
c. t= 5.9, reject Hy: B, = 0, percent nonwhite appears to 
be important. d. 99.514, y − ŷ = 3.486


a. Estimates of α, β1, and β2 are 52,912.77, −1.2060,
and −1.3988, respectively. b. R² = .782, f = 42.95,
P-value = 0 c. P-values for testing H0: β1 = 0 and
H0: β2 = 0 are both 0 d. A 95% PI is (14.18, 174.51)


Chapter 14 


1. 


eo NI mW w 


11. 


13. 


15. 


17. 


19. 


21. 


23. 


a. Reject H0.
b. Don’t reject H0.
c. Don’t reject H0.
d. Don’t reject H0.


χ² = 4.80, P-value > .10, so don’t reject H0.


χ² = 6.61, P-value > .10, so don’t reject H0.
χ² = 4.03 and P-value > .10, so don’t reject H0.


a. [0, .2231), [.2231, .5108), [.5108, .9163), [.9163,
1.6094), and [1.6094, ∞) b. χ² = 1.25, P-value > .10,
so the specified exponential distribution is quite plausible.
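
The class boundaries in part (a) match what one gets by splitting an exponential distribution into five classes of equal probability. A minimal sketch, assuming rate λ = 1 and five classes (both inferred from the boundary values, not stated in the answer):

```python
import math


def equiprobable_exponential_boundaries(k: int, lam: float = 1.0) -> list[float]:
    """Interior boundaries that split an exponential(lam) distribution
    into k classes of probability 1/k each (inverse cdf at i/k)."""
    return [-math.log(1.0 - i / k) / lam for i in range(1, k)]


boundaries = equiprobable_exponential_boundaries(5)
print([round(b, 4) for b in boundaries])
```

With equal class probabilities, every expected count is n/5, which keeps the chi-squared approximation reasonable for moderate n.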


a. (−∞, −.97), [−.97, −.43), [−.43, 0), [0, .43), [.43, .97),
and [.97, ∞) b. (−∞, .49806), [.49806, .49914),
[.49914, .5), [.5, .50086), [.50086, .50194), and [.50194, ∞)
c. χ² = 5.53, χ².10,5 = 9.236, so P-value > .10, and the
specified normal distribution is plausible.
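
The boundaries in parts (a) and (b) are consistent with six equal-probability classes for a normal distribution. The sketch below assumes six classes and, for part (b), a mean of .5 and standard deviation of .002; those two values are inferred from the listed boundaries and are not stated in the answer.

```python
from statistics import NormalDist


def equiprobable_z_boundaries(k: int) -> list[float]:
    """Interior standard normal boundaries giving k classes of probability 1/k."""
    return [NormalDist().inv_cdf(i / k) for i in range(1, k)]


# Part (a): z boundaries rounded to two decimals, as in the answer.
z = [round(v, 2) for v in equiprobable_z_boundaries(6)]

# Part (b): rescale the rounded z boundaries (mu and sigma are assumptions).
mu, sigma = 0.5, 0.002
x = [round(mu + zi * sigma, 5) for zi in z]

print(z)
print(x)
```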


p̂ = .0843, χ² = 280.3, P-value < .001, so the model gives
a poor fit. 


The likelihood is proportional to θ^233(1 − θ)^367, from which
θ̂ = .3883. The estimated expected counts are 21.00, 53.33,
50.78, 21.50, and 3.41. Combining cells 4 and 5, χ² = 1.62,
so don’t reject H0.
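
A likelihood of the form θ^a(1 − θ)^b is maximized at a/(a + b). The sketch below reads the exponents as 233 and 367, an assumption chosen because 233/600 reproduces the stated estimate .3883, and confirms the closed form with a coarse grid search.

```python
import math

# Assumed exponents of the likelihood theta**a * (1 - theta)**b.
a, b = 233, 367


def log_likelihood(theta: float) -> float:
    """Log of theta**a * (1 - theta)**b, up to an additive constant."""
    return a * math.log(theta) + b * math.log(1.0 - theta)


# Closed-form maximizer.
theta_hat = a / (a + b)

# Numerical check on a grid over (0, 1) with step .0001.
grid = [i / 10000 for i in range(1, 10000)]
theta_grid = max(grid, key=log_likelihood)

print(round(theta_hat, 4), theta_grid)
```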


μ̂ = 3.88, estimated expected counts are 6.2, 24.0, 46.6,
60.3, 58.5, 45.4, 29.4, 16.3, and 13.3, from which χ² =
7.8, P-value > .10, so the Poisson distribution provides a
good fit.


θ̂1 = (2n1 + n3 + n5)/2n = .4275, θ̂2 = .2750, χ² = 29.1,
P-value < .001, so reject H0.


Yes. The null hypothesis of a normal population distribu- 
tion cannot be rejected. 


Minitab gives r = .967, and since c.10 = .9707 and
c.05 = .9639, .05 < P-value < .10. Using α = .05, normality
is judged plausible.
25


27. 
29. 


31. 


35. 


37. 
39. 


41. 


43. 


47. 


49. 


χ² = 212.9, df = 6, P-value = 0, there appears to be an
association due to younger people tending to drink more. 


Yes. χ² = 44.98 and P-value < .001.


a. Yes, since χ² = 213.2, P-value = 0. b. Not at any
reasonable significance level, since χ² = 3.1, P-value > .10.


a. Yes. M: .26, .25, .29, .20: F: .11, .18, .34, .37 
b. Reject H0 at significance level .05 or .01, since χ² = 14.46
so .001 < P-value < .005. 


N,./n, 4Njj./n, 24 
χ² = 3.65, P-value > .10, so H0 cannot be rejected.


b. No. Since P-value = .023, the null hypothesis of no 
association can't be rejected at significance level .01 (but 
would be at level .05). 


χ² = 22.4 and P-value < .001, so the null hypothesis of
independence is rejected. 


P-value = 0, so the null hypothesis of homogeneity is 
rejected. 


a. Test statistic value = 19.2, P-value = 0 

b. Evidence of at best a weak relationship; test statistic 
value = —2.13 

c. Test statistic value = —.98, P-value > .10 

d. Test statistic value = 3.3, .01 < P-value < .05 


Combining 6 and 7 into one category and 8 and 9 into 
another gives a test based on 6 df for which χ² = .92 and
P-value > .9! 




Chapter 15 


1. a. P-value = 102 b. .026 < P-value < .046 c. .055 < 19. 
. (dijs)s dow) = (16, 87) 

. k = 14.06, .001 < P-value < .005, so reject Ho. 
. k = 9.23, P-value ~ .01, so reject Hp. 

. f, = 2.60, P-value > .10, so don’t reject Hp. 


P-value <.102 d. P-value=.05  e. z=3.7, P-value ~ 0. 1 
. 5, = 18, .02 < P-value < .05, so Hp is rejected. 


2 23 
5. s, = 72, P-value < .01, so Hy is rejected. 25 
7. 8, = 424, z = 2.56, P-value ~ .0052,, so reject Hp. 37 

9. d 0 2 4 6 8 10 12 14 16 18 20 
mp3 14 222 4131 29. 

P 24 24 24 24 24 24 24 24 24 24 24 
P-value = .167 31. 
11. w = 38, .008 < P-value < .028, so H) is rejected. 33. 


13. w = 25, P-value > .053, so don’t reject Hp. 


15. w = 39 P-value = .027, so H, is rejected at significance 35. 


level .05. 
17. (X¢), X32) = (11.15, 23.80) 


(—.585, .025) 


fr = 9.62, .02 < P-value < .025, so reject H0 at significance
level .05.


(—5.9, —3.8) 


a. 021 b. y= 12, P-value = .252, so H, cannot be 
rejected. 


w′ = 26, P-value > .056, so don’t reject H0.


Chapter 16 


1. All points on the chart fall between the control limits. 31. 
3. .9802, .9512, 53 33. 


Sai 1676). bel6? “ot,=c, 
(USL + LSL)/2 


7. a. .0301 b. .2236 c. .6808 
9. LCL = 12.20, UCL = 13.70. No. 
11. LCL = 94.91, UCL = 98.17. There appears to be a prob- 35 


= when pw = 


lem on the 22nd day. 
13. a. 200 b. 4.78 c. 384.62 (larger), 6.30 (smaller) 37 
15. LCL = 12.37, UCL = 13.53 39 


17. a. LCL = 0, UCL = 6.48 b. LCL = .48, UCL = 6.60 
19. LCL = .045, UCL = 2.484. Yes, since all points are 


inside the control limits. 41. 


21. a. LCL = .105, UCL = .357__b. Yes, since .39 > UCL. 
23. p > 3/53 
25. LCL = 0, UCL = 10.1 4B 


27. When area = .6, LCL = Oand UCL = 14.6; when area = 
.8, LCL = 0 and UCL = 13.4; when area = 1.0, LCL = 0 
and UCL = 12.6. 


29. 1: 1 2 3 4 


5 
d: 0 001 017 0 0 010 0 O 
e. 0 0 0 038 860 0 0 .054 
ie 29 10 11 12 13 14 15 45 
d: 0  .024  .003 0 0 0  .005 
er 0 0 0 O15 0 0 0 


There are no out-of-control signals. 


n= 5,h = .00626 

Hypergeometric probabilities (calculated on an HP21S 
calculator) are .9919, .9317, .8182, .6775, .5343, .4047, 
.2964, .2110, .1464, and .0994, whereas the correspond- 
ing binomial probabilities are .9862, .9216, .8108, .6767, 
5405, .4162, .3108, .2260, .1605, and .1117. The approxi- 
mation is satisfactory. 


. 9206, .6767, .4198, .2321, .1183; the plan with 


n= 100, c = 2 is preferable. 


. .9981, 5968, and .0688 
. a. 010, .018, .024, .027, .027, .025, .022, .018, .014, .O11 


b. .0477, .0274 ¢. 77.3, 202.1, 418.6, 679.9, 945.1, 
1188.8, 1393.6, 1559.3, 1686.1, 1781.6 


X chart based on sample standard deviations: LCL = 
402.42, UCL = 442.20. X chart based on sample ranges: 
LCL = 402.36, UCL = 442.26. S chart: LCL = .55, UCL 
= 30.37. R chart: LCL = 0, UCL = 82.75. 


. S chart: LCL = 0, UCL = 2.3020; because s,, = 2.931 > 


UCL, the process appears to be out of control at this 
time. Because an assignable cause is identified, recalculate
limits after deletion: for an S chart, LCL = 0
and UCL = 2.0529; for an X chart, LCL = 48.583 and 
UCL = 51.707. All points on both charts lie between the 
control limits. 


. x = 430.65, s = 24.2905; for an S chart, UCL = 62.43 


when n = 3 and UCL = 55.11 when n = 4; for an X chart, 
LCL = 383.16 and UCL = 478.14 when n=3, and 
LCL = 391.09 and UCL = 470.21 when n = 4. 
Glossary of Symbols/


Abbreviations 


Symbol/ Symbol/ 
Abbreviation Page Description Abbreviation Page Description 
n 13 sample size Pra 70 number of permutations of size k 
x 13 variable on which observations from n distinct entities 
d 
ie S 1B a ee eines (") 71 number of combinations of size k 
moe : from n distinct entities 
ps x; 30 sum Of x),.X),.-.5%, P(A|B) 75 conditional probability of A given 
i=l 
x 29 sample mean at capeaas 
iz 30 population mean Iv 96 random variable 
N 30 population size when the J 96 a random variable . : 
population is finite X(s) 96 value of the rv X associated with 
x 31 sample median the onmenes 
~ 32 population median x 96 some particular value of the rv x 
a 33 He een D(x) 99 probability distribution (mass 
aa 34 sample proportion function) of a discrete rv X 
a 37 sample variance pmf 100 probability mass function 
Ss 37 sample standard deviation P(x; ex) 103 pmf with peraicter a . 
gw 38 population variance and standard F(x) 104 cumulative distribution function 
deviation ora re : 
ae 39 degrees of freedom for a single cdf 104 cumulative distribution function 
sample = 106 largest possible X value smaller 
Sy 39 sum of squared deviations from vane 
; the sample mean E(X), [Ly, bh 110 mean or expected value of the 
de 40 sample fourth spread ~ : 
e 53 sample space of ati experiment E{h(X)] 112 expected value of the function h(X) 
A.B.C.C 54 ane cae V(X), o%07 =114 variance of the rv X 
a eee 55 complement of the event A Oy, o 114 standard deviation of the rv X 
AUB 55 union of the events A and B S, F 118 success/failure on a single trial of 
ANB 55 intersection of the events A and B a binomial experiment ; 
ia) 56 the null event (event containing n 118 number of trials in a binomial 
no outcomes) epeninen ; 
P(A) 58 probability of the event A Dp 118 probability of success on a single 
N 63 number of equally likely outcomes trial of a binomial or negative 
N(A) 64 number of outcomes in the event A : binomial CEDEMINENE a 
hy Ms 69 number of ways of selecting Ist X~Bin(n, p) 120 the rv X has a binomial distribution 


(2nd) element of an ordered pair 


with parameters n and p 
Symbol/ Symbol/
Abbreviation Page Description Abbreviation Page Description 
D(x; n, p) 120 binomial pmf with parameters n 0,, 0, 190 location and scale parameters 
and p p(x, y) 199 = joint pmf of two discrete rv’s X 
B(x; n, p) 121 cumulative distribution function and Y 
of a binomial rv Px(X), Py) 200 marginal pmf’s of X and Y, 
M 126 number of successes in a respectively 
dichotomous population of KO), Fro”) 202 marginal pdf’s of X and Y, 
size N respectively 
h(x; n, M, N) 126 hypergeometric pmf with PO, -+ +5 %_) 206 = joint pmf of then rv’s X,,..., X,, 
parameters n, M, and N PX 5) 206 = joint pdf of then rv’s X;,..., X,, 
r 129 number of desired successes in a Fux |x) 209 conditional pdf of Y given 
negative binomial experiment that X = x 
nb(x; r, p) 129 negative binomial pmf with Py| x0 |x) 209 conditional pmf of Y given 
parameters r and p that X = x 
L 131 parameter of a Poisson distribution E(Y|X = x) 209 expected value of Y given 
D(x; fL) 131 Poisson pmf that X = x 
F(x; pb) 132 Poisson cdf E[h(X, Y)] 213. expected value of the function 
At 134 length of a short time interval W(X, Y) 
o(At) 134 quantity that approaches 0 faster Cov(X, Y) 214 — covariance between X and Y 
than At does Corr(X, Y), Pyy,p 216 — correlation coefficient for X and Y 
a 134 rate parameter of a Poisson process x 221 the sample mean regarded as an rv 
a(t) 134 rate function of a variable-rate S? 222 the sample variance regarded as an rv 
Poisson process CLT 232 ~=Central Limit Theorem 
pdf 143 probability density function 0 248 generic symbol for a parameter 
f(x) 143 probability density function of a 0 248 point estimate or estimator of 0 
continuous rv X MVUE 256 minimum variance unbiased 
f(x; A, B) 144 —_ uniform pdf on the interval [A, B] estimator (or estimate) . 
F(x) 148 cumulative distribution function F,5§ 259 estimated standard deviation of 6 
HP) 151 100pth percentile of a continuous Kissa g Ry 259 bootstrap sample 
distribution 6 259 estimate of @ from aboot 
im 152 median of a continuous distribution strap sample 
T(x; Bb, o) 157 pdf of a normally distributed rv mle 270 maximum likelihood estimate 
N(w, 07) 158 normal distribution with (or estimator) 
parameters 2 and o? CI 277 confidence interval 
Z 158 a standard normal rv 100(1 — a)% 281 confidence level for a CI 
z curve 158 standard normal curve T 295 variable having a ¢ distribution 
D(z) 158 cdf of a standard normal rv v 296 degrees of freedom (df) 
Zs, 160 value that captures upper-tail parameter for a f distribution 
area a under the z curve t 296  ¢ distribution with v df 
A 170 parameter of an exponential ies 296 value that captures upper-tail area 
distribution a under the ¢, density curve 
T(x; A) 170 exponential pdf PI 300 prediction interval 
T(a) 172 the gamma function Xv 304 value that captures upper-tail area 
T(x; a, B) 173 gamma pdf with parameters a a under the chi-squared 
and B density curve with v df 
df 175 degrees of freedom Ay 311 null hypothesis 
v 175 number of df for a chi-squared A, 311 alternative hypothesis 
distribution a 319 significance level, probability 
T(x; a, B) 177 Weibull pdf with parameters a of a type I error 
and B B 319 probability of a type II error 
f(x bo) 179 lognormal pdf with parameters Mo 327 null value in a test concerning 
anda Z 327 __ test statistic based on standard 
f(x; a, B,A,B) 181 beta pdf with parameters normal distribution 
a, B, A, B pb’ 330 alternative value of uw ina B 


calculation 




Symbol/ Symbol/ 
Abbreviation Page Description Abbreviation Page Description 
B(w') 330 type II error probability when bw 426 average of population means in 
h=p’ single-factor ANOVA 
T 335 test statistic based on ¢ distribution Oy... Ay 426 treatment effects in a single- 
% 346 null value in a test concerning 6 factor ANOVA 
Po 347 null value in a test concerning p €;; 426 deviation of X;; from its mean 
p' 347 alternative value of p in a B value 
calculation Jigcsagdy 430 individual sample sizes in a 
BO’) 348 type II error probability when single-factor ANOVA 
p=p' n 430 total number of observations in a 
Oo, Og 355 disjoint sets of parameter single-factor ANOVA data set 
values in a likelihood ratio test Apc s np Ay 432 random effects in a single-factor 
m,n 362 sample sizes in two-sample ANOVA 
problems A,B 437 factors in a two-factor ANOVA 
Ay 363 null value in a test concerning K; 437 number of observations when 
By — Bo factor A is at level i and factor 
A’ 366 alternative value of w, — pina B is at level j 
B calculation LJ 437 number of levels of factors A and 
S? 378 pooled estimator of o? a B, respectively 
D, 383 the difference X, — Y, for the i Xj 439 average of observations when 
pair (X;, Y,) A (B) is at level i (7) 
d, 8p 385 sample mean difference, sample Mi 439 expected response when A is at 
standard deviation of level i and B is at level j 
differences for paired data a, B; 441 effect of A (B) at level i (j) 
P 392 common value of p, and p, taste 442 F ratios for testing hypotheses 
when p, = Pp, about factor effects 
F 399 rv having an F distribution A, B; 448 factor effects in random 
Vy), V> 399 numerator and denominator df effects model 
for an F distribution 7, OF 448 variances of factor effects 
QP 399 value capturing upper-tail area K 451 sample size for each pair (i, /) 
a under an F curve with v,, v, df of levels 
ANOVA 409 analysis of variance Vii 451 interaction between A and B at 
I 410 number of populations in a levels i and j 
single-factor ANOVA A,, B; Gi 456 effects in mixed or random 
J 412 common sample size when effects models 
sample sizes are equal a;, B;, Ox 456 main effects in a three-factor 
Xin Xi 412 jth observation in a sample from ANOVA 
_ the ith population vy? ve, Vie 460 two-factor interactions in a 
;. 412 mean of observations in sample three-factor ANOVA 
_ from ith population Vik 460 three-factor interaction in a 
X.. 412 mean of all observations in a three-factor ANOVA 
data set LIK 460-461 number of levels of A, B, C 
MSTr 413 mean square for treatments in a three-factor ANOVA 
MSE 413 mean square for error By, Bo 491 slope and intercept of population 
F 415 test statistic based on F distribution regression line 
x; 415 total of observations in ith sample € 49] deviation of Y from its mean 
Xs 415 grand total of all observations value in simple linear regression 
SST 416 total sum of squares o 491 variance of the random deviation € 
SSTr 416 treatment sum of squares eee 492 mean value of Y when x = x* 
SSE 416 error sum of squares Cw 492 variance of Y when x = x* 
m, Vv 420 parameters for Studentized range Bi. Bo 496 least squares estimates of 
distribution B, and By 
On nis 420 value that captures upper-tail S. 498 (x; — HG; — ¥) 
area a under the associated 3, 500 predicted value of y when x = x; 
Studentized range density curve SSE 502 error (residual) sum of squares 


Symbol/ Symbol/ 
Abbreviation Page Description Abbreviation Page Description 
SST 504 total sum of squares S,, nj 640 total number of sampled 
fe 504 coefficient of determination individuals in category j 
Sa, 512 estimated standard deviation of B, Pij 640 proportion of population i in 
ry 528 sample correlation coefficient category j 
e; 543 a standardized residual ej 641 estimated expected count in 
BG@=1,...,k) 562 coefficient of x! in polynomial cell i, j 
regression ni 643 number in sample falling into 
B; 563 least squares estimate of B; category i of 1st factor and 
R 565 coefficient of multiple category j of 2nd factor 
determination Pij 643 proportion of population in 
B; 567 coefficient in centered category i of 1st factor and 
polynomial regression category j of 2nd factor 
B; 72. population regression coefficient Sy 654 signed-rank statistic 
of predictor x; W 662 rank-sum statistic 
B 576 least squares estimate of B; K 672 Kruskal-Wallis test statistic 
SSE,, SSE, 585 SSE for full and reduced models, Ri 672 rank of X;, among all NV 
respectively observations in the data set 
lr, 600 normalized expected total R,. 672 average of ranks for observations 
estimation error in the sample from population 
Cy 600 estimate of I’, or treatment i 
hi 604 coefficient of y, in J; F, 674 Friedman’ test statistic 
Xow 621 value that captures upper-tail UCL 679 upper control limit 
area a under the x? curve LCL 679 lower control limit 
with v df C,, Cor 680-68 1 process capability indices 
a 622 test statistic based on a R 685 sample range 
chi-squared distribution ARL 687 average run length 
Pio +++ > Pxo 622 null values for a chi-squared test IQR 688 interquartile range 
of a simple Hy CUSUM 700 cumulative sum 
(0) 628 category probability as a function OC 709 operating characteristic 
of parameters 0),..., 9, AQL 710 acceptable quality level 
LJ 639 number of populations and LTPD 710 lot tolerance percent defective 
categories in each population AOQ 713 average outgoing quality 
when testing for homogeneity AOQL 713 average outgoing quality limit 
LJ 639 numbers of categories in each of ATI 713 average total number inspected 
two factors when testing for 
independence 
n 640 number of individuals in sample 
from population i who fall into
category j 


Index 


Note: Page numbers preceded by the letter “A” indicate appendix table page numbers and the page numbers followed by a “f” indicate figures. 


A 


Acceptable quality level (AQL), 710 
Acceptance sampling, 708-713 
designing a single-sampling plan, 710-711 
double-sampling plans, 711-712 
rectifying inspection and other design criteria, 
712-713 
single-sampling plans, 709-710 
standard sampling plans, 713 
Additive model, 439-441 
Adjusted coefficient of multiple determination, 
565 
Adjusted residual plot, 580, 587 
Aliased with, 477 
Alias pairs, 477 
Alternative hypothesis, 311-312 
Analysis of variance. See ANOVA 
Analytic studies, 9-10 
ANOVA (analysis of variance), 409-436, 
437-486 
defined, 409 
distribution-free, 671-675 
expected mean squares, 443-444 
fixed effects model, 432, 439-443, 460-464, 
451-452 
F test, 399-402, 414-416, 429 
Friedman’s test, 673-674 
introduction, 409-410 
Kruskal-Wallis test, 671-673 
Latin square designs, 464-466 
model equation, 426-429 
multifactor, 437-486 
multiple comparisons procedure, 420-426, 444, 
455 
noncentrality parameter, 427-428 
notation and assumptions, 412-413 
random effects model, 432-433 
randomized block experiments, 444-447 
regression and, 516 
sample sizes, 430-431 
single-factor, 409, 410-420, 426-435 
sums of squares, 416-419 
table, 417-418, 419, 422-423 
test procedures, 363-365, 411, 413, 452-455, 
460-464 
test statistic, 413-414 
three-factor, 460-469 
transformations, 413, 426, 431-432 
two-factor, 399-403, 438-459 
See also Single-factor ANOVA; Three-factor 
ANOVA; Two-factor 
ANOVA 
Ansari-Bradley test, 677
Assignable causes, 679, 682 
Asymptotic relative efficiency (ARE), 659-660 
Attribute data 
control charts for, 695-700 
explanation of, 695 
Average outgoing quality, 713 
Average outgoing quality limit (AOQL), 713 


Average total number inspected (ATI), 713 
Axioms, of probability, 58-59 


B 


Backward elimination method, 602 
Bayes’ theorem, 80-82 
Bernoulli distribution, 122, 236 
Bernoulli random variable, 97 
Beta distribution, 181-182 
Biased estimator, 252, 257 
Bimodal histogram, 22 
Binomial distribution, 117-125 
approximating, 117, 119, 165-166 
defined, 118 
and hypergeometric distribution, 126-128 
negative, 128-130 
normal approximation for, 117 
Poisson distribution and, 131-136 
probability and, 117-125, 165 
rule of, 119 
tables, A-2—A-4 
theorem, 120 
Binomial experiment, 118 
Binomial random variable, 119 
Bivariate, 218-219 
data, 4 
normal distribution, 218-219 
Blocking 
confounding and, 474-477 
randomized block experiments and, 444-447 
Bonferroni inequality, 523 
Bonferroni intervals, 523 
Bootstrap method, 260-261 
confidence intervals and, 284 
estimate of standard error and, 259-261 
Bound on error of estimation, 282 
Box, George, 679 
Boxplots, 40-41 
comparative, 43-44 
defined, 40 
outliers shown in, 42 
“Broken stick” model, 153 


C


Calibration, 326, 346, 656 
Categorical data, 3-4, 34, 23 
analysis of, 619-627 
sample proportions and, 34 
Categorical variables, 574-576 
Cauchy distribution, 257—258 
Causality, comparison identifying, 365-366 
Causation, correlation vs., 216-218 
c control chart, 697-698 
Cell counts 
estimated expected, 629-633, 635-636, 644 
expected, 621, 623-626 
observed, 621-622 
Censoring, 258-259 


Census, 3 
Centering x values in regression, 567-568 
Central Limit Theorem (CLT), 232-235 
alternative applications for, 235-236 
binomial distribution and, 235 
lognormal distribution and, 236 
Poisson distribution and, 236 
rule of thumb for, 234 
Chebyshev’s inequality, 117 
Chi-squared distribution, 174-175, 621-623 
critical values for, 304, 622, 636, A-11 
curve tail areas, 622, A-21-A-22 
degrees of freedom for, 399, 621, 629, 631, 
641, 644, 645 
goodness-of-fit tests and, 642-643, 641-646 
Chi-squared tests 
goodness-of-fit, 627-637 
homogeneity, 640-643 
independence, 643-646 
normality, 636-637 
P-values for, 624-625, 633, 635 
Classes, 18 
Classical confidence interval, 280 
Class intervals, 18, 21-22 
Coefficient of determination, 503-505 
Coefficient of multiple determination, 565, 578 
Coefficient of variation, 48, 195, 332 
Combinations, 69-73 
Comparative boxplot, 43-44, 46, 47 
Comparative stem-and-leaf display, 25-26 
Complement of an event, 55 
Complete layout, 464 
Composite hypotheses, 627-639 
Compound event, 54 
Conceptual population, 7 
Conditional distributions, 209 
Conditional probability, 75-85 
Bayes’ theorem and, 80-82 
multiplication rule and, 77-80 
Conditional probability density function, 209 
Conditional probability mass function, 209 
Confidence bound, 291-292 
Confidence intervals, 277-309 
basic properties of, 277-285 
Bonferroni, 523 
bootstrap, 284 
bounds, 291-292 
classical, 280 
confidence levels for, 281-282 
correlation coefficient, 527-537 
defined, 6 
derivation of, 282-284 
difference between means, 369-371, 378 
difference between proportions, 391-398 
distribution-free, 667-671
exponential distribution, 170-172 
future value, prediction of, 299-300, 519-527 
general, 288 
hypothesis testing and, 353-354 
interpretation of, 279-280 
introduction, 276 
I-2 Index


Confidence intervals (continued) 
large-sample, 285-294, 396-397 
levels of confidence and, 280-281 
mean difference, 362-373, 378 
multiple regression, 504-505
nonnormal distribution, 301-302 
normal distribution, 286 
normal population distribution, 295-304 
one-sample t, 297-299 
one-sided, 291-292 
paired data and, 385-386 
paired t, 383-385 
parametric functions, 424-425 
Poisson distribution, 131-136 
polynomial regression, 562-571 
population mean, 110, 277, 285-294 
population mean difference, 362-373, 378 
population proportion, 289-291, 391-398 
precision of, 281-282 
and prediction intervals, 299-300 
properties, 277-285 
random interval, 278 
ratio of standard deviations, 402 
ratio of variances, 399-402 
sample size and, 281-282 
score, 289-290 
sign, 653-654 
simple linear regression, 487-496 
simultaneous, 354-355
slope, 488, 510-512 
slope of regression line, 510-519 
standard deviation, 304-306
t distribution, 295-297 
tolerance, 300-301 
two-sample t, 374-379, 386-387 
uniform distribution, 144-145, 148f, 149f 
variance, 304-306 
Wilcoxon rank-sum, 669-671, A-25 
Wilcoxon signed-rank, 667-669, A-24 
Confidence levels, 281-282 
simultaneous, 354-355
Wilcoxon signed-rank interval, A-26
Wilcoxon rank-sum interval, A-27-A-28
Confounding, 474-477 
Consistent estimator, 274 
Contingency tables, 639-648 
Continuity correction, 165 
Continuous distribution, 142-146, 177-183, 
625-626 
goodness of fit for, 633-636 
mean of, 657 
median of, 152 
percentiles of, 150-152 
variance of, 179, 181 
Continuous random variable, 98, 201-204 
cumulative distribution function, 104-107
expected values, 147-156 
gamma distribution, 173 
jointly distributed, 201-204 
probability distribution of, 95, 98, 201-204 
standard deviation of, 154 
variance of, 154, 180
Continuous variable, 98 
Contrasts, 470 
Control charts, 679-700 
attribute data, 695-700 
based on known parameters, 681-683
c chart, 697-698 
CUSUM procedures, 700-708 
estimated parameters, 683-685 
general explanation, 679-681
location, 681, 690-691 
p chart, 696-697 


performance characteristics, 686-687 
probability limits, 694 
process location, 681-690 
process variation, 690-695 
R chart, 692-694 
recomputing control limits, 685-686 
robust, 688 
S chart, 691-692 
supplemental rules for, 688 
transformed data, 698-699 
variation, 690-695 
Control limits, 684-686 
recomputing, 685-686 
sample ranges, 685 
Convenience sample, 10 
Convex function, 197 
Correction factor for mean, 128, 417 
Correlation, 216-218, 527-537 
causation vs., 216-218 
joint probability distributions and, 216-218 
linear relationship and, 217-218 
testing for absence of, 531-532 
other issues in, 532-534 
Correlation coefficient, 216-217 
bivariate normal distribution, 218-219 
confidence interval, 512-514 
hypothesis testing, 514-516 
multiple, 579 
point estimation, 530-532 
population, 530-533 
properties of r, 529-530 
random variables, 216 
sample, 527-530 
Counting techniques, 66-75 
Covariance, 214-216 
joint probability distributions and, 
214-216 
Coverage probability, 290 
Critical values, A-2-A-28
chi-squared, 304, A-21-A-22
F, 399, A-14-A-19
Ryan-Joiner test, A-23
standard normal, 160-161 
studentized range, A-20 
t, 296, A-9 
tolerance, A-10 
Wilcoxon rank-sum interval, 669-671, 
A-27-A-28 
Wilcoxon rank-sum test, 661-666, A-25 
Wilcoxon signed-rank interval, 667-669, 
A-26 
Wilcoxon signed-rank test, 653-661, A-24 
zα notation, 160-161
Cross-validation, 560 
Cubic regression, 563, 570-571 
Cumulative binomial probabilities, A-2-A-4
Cumulative distribution function, 104-107,
147-156 
Cumulative frequency, 29 
Cumulative Poisson probabilities, A-4-A-5
Curtailment, 712 
CUSUM (cumulative sum) procedures, 700-708 
computational, 703-706
designing, 706-708 
V-mask, 700-703 


D 


Danger of extrapolation, 499 
Data, 3, 9-12 
attribute, 695-700 
bivariate, 4 
categorical, 34, 619-627 


collecting, 10-12 
continuous, 19-20 
discrete, 17 
multivariate, 4, 24 
paired, 382-391 
qualitative, 23-24 
transformation, 431-432, 698-699 
types, 24, 445, 447, 453, 463, 601, 673, 674, 
682, 689-690, 691-692, 693 
univariate, 3 
variability, for sample data, 36-38 
Degrees of freedom, 304 
chi-squared distribution, 175, 304, 
399-400, 621, 629, 631, 641, 644, 645 
F distribution, 399, 414-416, 429 
goodness-of-fit tests, 621, 629, 631, 641, 644, 
645 
homogeneity test, 640-643 
independence test, 643-646 
multiple comparisons, 455 
paired vs. unpaired experiment, 387-388 
pooled t, 377-378 
p values, 342 
regression, 584 
sample variance, 38-39, 304, 399-402 
single-factor ANOVA, 409, 410-420, 
426-435 
single sample, 39 
t distribution, 295-297, 298, 335 
two-sample t, 374-379 
Deleted observation regression, 605 
Deming, W. E., 9, 708 
Density curve, 143, 144, 145, 227 
Density estimate, 22 
Density scale, 20-22 
Dependent events, 85 
Dependent random variables, 204 
Dependent variable, 488 
Derivations from mean, 225-226, 669 
Descriptive statistics, 4-51 
overview, 1-3 
populations, samples and processes, 3-12
pictorial and tabular methods, 13-29 
measures of location, 29-36 
measures of variability, 36-51 
Deterministic relationship, 487-488, 529 
Deviations from the mean, 36-38 
Diagnostic plots for model adequacy, 544-545
difficulties and remedies, 546-547 
Diagram 
Pareto, 29, 147 
tree, 67-68 
Venn, 56 
Discrete distribution, 631-633 
Discrete population, 164-165
Discrete random variable, 98 
cumulative distribution function, 104-107 
expected value, 109-117 
introduction to, 95-99 
jointly distributed, 199-201 
probability distributions, 99-109 
variance, 113-114 
Discrete uniform distribution, 37 
Discrete variable, 16 
Disjoint events, 56, 58-59, 62, 98 
Distribution-free ANOVA, 671-675 
Friedman test, 673-674 
Kruskal-Wallis test, 671-673 
Distribution-free confidence intervals, 667-671
Wilcoxon rank-sum test, 661-666 
Wilcoxon signed-rank test, 653-661 
Distribution-free test procedures, 356 
ANOVA, 671-675 
sign, 653-654
Wilcoxon rank-sum test, 669-671, A-25 
Wilcoxon signed-rank test, 667-669, A-24 
Distribution function. See Cumulative 
distribution function 
Dotplot, 15-16, 32 
Double-blind experiment, 395 
Double-sampling plans, 711-712 
Dummy variable, 574 
Dunnett’s method, 425 


E


Effects 
fixed, 439-441, 451-452, 460-464 
main, 461-462, 464 
mixed, 448, 456-457 
random, 432-433, 448, 456-457 
Efficiency ratio, 494 
Efron, Bradley, 284 
Empirical rule, 163 
Enumerative studies, analytic v., 9-10 
Equally likely outcomes, 63-64 
Error probabilities, 122, 319 
Error, 251 
error-free test procedures, 318 
of estimation, bound on, 259, 282 
experimentwise error rate, 355, 424 
horizontal and vertical, 196 
hypothesis test, 317-323 
mean square, 251, 274, 413, 581 
measurement, 154, 156-157, 185-186, 234, 
251, 273 
positive, 195 
prediction, 299-300 
probabilities of, 122, 319 
random error in regression, 491 
random error variance, 451, 492, 595 
standard, 41, 259-261, 514 
systematic, 251 
type I, 317-324, 349, 354-355, 356, 367, 
387, 514, 680, 710 
type II, 317-318, 320-323, 330-331, 333,
340, 348, 350, 366, 367, 378-379, 387, 
394-395, 427, 680, 710 
unbiased estimator error, 251 
variance analysis, 514 
Error probabilities, 122 
Error sum of squares, 416-417, 502 
Estimated expected cell counts, 629-630,
631-633, 635
Estimated regression line, 496-506 
Estimated standard error, 259-261 
Estimate 
bootstrap, 260-261 
interval, 5, 276, 281-282, 361, 396, 523 
least squares, 496-503 
point, 247-275 
Estimation. See Point estimation 
Estimator. See Point estimator 
Event(s), 54-57 
complement of, 55 
compound, 54 
defined, 54 
dependent, 85 
disjoint, 56, 58-59, 62, 98 
exhaustive, 80-81 
independent, 85-91 
intersection of, 55 
mutually exclusive, 56 
mutually independent, 87-89 
null, 56, 58 


probability and, 54-55 
set theory, relation to, 55-56 
simple, 54 
union of, 55
Exceedance probability, 694 
Expected cell counts, 629-630, 631-633, 635 
Expected mean squares, 251, 443-444 
Expected value, 109-114, 213-214 
continuous random variable, 98, 152-154, 213 
covariance, 214-216 
of difference, 362-363
discrete random variable, 109-117 
of a function, 112-113 
of a linear function, 113-114, 491 
rules of, 113
variance and, 113-114, 115-116 
Experiment, 53 
binomial, 118-119, 120, 620 
censoring/uncensored, 259 
defined, 53 
double-blind, 395 
factorial, 469-474 
multinomial, 207, 620 
paired vs. unpaired, 387-388 
pictorial, 68 
randomized block, 444-447 
randomized controlled, 366 
sample space of, 53-55 
screening, 469 
simulation, 222, 225-229 
trinomial, 206 
Experiment-wise error rate, 355, 424 
Explanatory variable, 488 
Exponential distribution, 170-172 
confidence interval, 277, 282-284 
defined, 170 
hypothesis test, 170-172 
memoryless property of, 172 
point estimation, 258-260, 265 
Poisson process and, 171-172 
Exponential regression model, 555, 597 
Exponentially weighted moving-average control 
chart, 715 
Exponential smoothing, 50-51 
Extrapolation, danger of, 499 
Extreme outlier, 14, 16, 41, 42-43 
Extreme value distribution, 190-191, 195 


F 


Factorial experiments, 469-483 
2^p experiments, 473-474
2^3 experiments, 469-474
Factorial notation, 70, 471 
Factors, 409 
Failure rate function, 196 
Family error rate, 424 
Family of probability distributions, 103 
F distribution, 414-416 
critical values, 414-415, A-14-A-19 
degrees of freedom, 399-400 
F test, 399-402, 414-416, 429 
noncentral, 427-428 
single-factor ANOVA and, 409, 410-420, 
426-435 
two-factor ANOVA and, 399-403 
Finite population correction factor, 128 
First-order multiple regression models, 572-573, 
586-587 
Fisher, R. A., 66, 266, 532-533 
Fisher-Irwin test, 266, 397, 532-534
Fisher transformation, 534 




Fitted values, 500 
Fixed effects model, 432, 439-443, 460-464, 
451-452 
single-factor ANOVA, 409, 410-420, 426-435 
two-factor ANOVA, 438-459 
three-factor ANOVA, 460-464 
Forward selection method, 602 
Fourth spread, 40-41, 44, 220, 221 
Fractional replication, 477-480 
Fraction-defective data, 696-697 
Frequency, 16 
cumulative, 29 
relative, 16-18, 60 
Frequency distribution, 17, 24 
Friedman test, 673-674 
F tests, 414-416 
β for, 427-429
distributions and, 414-416 
equality of variances, 399-402 
group of predictors, 584-585 
multiple regression, 566, 584-587, 595 
population treatments, 410, 413 
P-values for, 402 
simple linear regression, 487, 516 
single-factor ANOVA, 409, 410-420, 
426-435 
t tests and, 429 
Full estimators, 632 
Fundamental identity, 417 
Fundamental Theorem of Calculus, 150 
Future value, prediction of, 299-300, 519-527 
F(x), to compute probabilities, 149-150 
obtaining f(x) from, 150 


G 


Galton, Francis, 505-506 
Gamma distribution, 172-173 
point estimation, 265 
standard distribution, 173 
Gamma function, 173-174 
incomplete, 173-174, A-8 
Gauss, Carl Friedrich, 496 
Gaussian distribution, 323 
General additive multiple regression model 
equation, 555-557, 572 
Generalized interaction, 476 
Generalized negative binomial distribution, 130 
Geometric distribution, 129-130 
Geometric random variable, 129-130 
Goodness-of-fit tests, 619-639 
category probabilities and, 620-627 
composite hypotheses and, 627-639 
continuous distributions and, 625-626, 
633-636 
discrete distributions and, 631-633 
normality and, 636-637 
Grand mean, 412, 441 
Grand total, 416 
Graph, line, 101-102 
Greco-Latin square design, 486 


H 


Half-normal plot, 193 
Half-replicate, 477 
Heavy tails, 111, 116, 189, 258, 547, 659, 665 
Histogram, 5f, 16-23, 24f 
bimodal, 22 
binomial probability, 165 
continuous data, 19-22 
density, 20-22 
discrete data, 17 
multimodal, 22 
negatively skewed, 23 
positively skewed, 23 
probability, 102 
shape of, 22-23 
smoothed, 22 
symmetric, 23 
unimodal, 22 
Hodges-Lehmann estimator, 274
Homogeneity, 640-643 
null hypothesis of, 641 
testing for, 640-643 
Homogenous populations, 118, 206, 498, 643
Hyperexponential distribution, 195 
Hypergeometric distribution, 126-128 
and binomial, 117-122 
Hypothesis, 310 
alternative, 311-312 
composite, 627-639 
defined, 311 
null, 311, 462, 641 
researcher’s, 312 
simple, 627-628 
statistical, 311, 352-353 
Hypothesis testing, 310-360 
Ansari-Bradley test, 677
aspects of, 352-360 
confidence intervals and, 353-354 
correlation coefficient, 527-532 
difference in means, 362-373, 378 
difference in proportions, 391-398 
distribution-free, 356, 652-677 
errors in, 317-323 
explanation of, 643 
exponential distribution, 170-172 
Fisher-Irwin test, 266, 397, 532-534
Friedman test, 673-674 
goodness of fit, 627-639 
homogeneity of populations, 640-643 
independence of factors, 643-646 
introduction, 310 
issues related to, 352, 595-610 
Kruskal-Wallis test, 671-673
large-sample, 331-333, 346-349, 367-369, 
392-394, 396-397, 658 
likelihood ratio principle, 355-356 
lower-tailed, 328, 364 
McNemar test, 398, 408 
mean difference, 362-373, 378 
multiple regression, 566-567, 580-582 
normal distribution, 311, 312-317
one-sample t, 335-346 
paired t, 383-387 
Poisson distribution, 131-136 
polynomial regression, 562-571 
pooled t, 377-378 
population mean, 326-334, 362-373, 378 
population proportion, 346-352 
power in, 340-341 
procedures for, 311-326 
P-values and, 312-317, 341-344, 364 
Ryan-Joiner, A-23 
sample-size determination, 338-341, 
366-367 
Siegel-Tukey test, 677 
significance level, 316-317, 319 
sign test, 653-654 
simple linear regression, 514-516 
small-sample, 349-350, 397 
steps in, 329 
test statistic, 316-317 
two-sample t, 374-379 
two-tailed, 328, 364
type II error probability, 317-323, 331, 350-351,
378-379, 394-395
upper-tailed, 336, 364
variance, 363-365
Wilcoxon rank-sum test, 661-666, A-25
Wilcoxon signed-rank test, 653-661, A-24
Hypothetical population, 7 


I 


Incomplete gamma function, 173-174, A-8 
Incomplete layout, 464 
Independence, 85-91
of events, 85-91
multiplication rule and, 86-87
mutual, 87-89
probability and, 85-91
testing for, 643-646
Independent events, 85-86 
Independent random variables, 204-205, 207 
Independent variable(s), 488 
Indicator variable, 574 
Inferential statistics, 5-6 
Influential observations, 604-605 
Interaction, 451
generalized, 476
two-factor, 460-463
three-factor, 460-461
quadratic predictor, 572-574
Interaction parameters, 451-452 
Interaction sum of squares, 452 
Interquartile range (IQR), 40-41, 44, 688
Intersection of events, 55 
Interval estimate. See Confidence interval 
Interval
class, 18-20
confidence, 281-282
prediction, 299-300, 523-524
random, 278
Intrinsically linear function, 550-551 
Intrinsically linear model, 551-552 
Invariance principle, 270 


J 


Jensen’s inequality, 197 
Joint confidence level, 424, 522 
Joint density function, 201 
Jointly distributed random variables, 
198-212 
conditional, 209 
independence of, 204-205
more than two, 205-208
two continuous, 201-204 
two discrete, 199-201 
Joint marginal density function, 212 
Joint probability density function, 201 
Joint probability distributions, 
198-199, 211 
Joint probability mass function, 199 
Joint probability table, 199-200, 
204, 205 


K 


Kemp nomogram, 706 
k-out-of-n system, 137 
k-predictor model, 600, 602 
Kruskal-Wallis test, 671-673
kth moment of the distribution, 264
kth population moment, 264-265
kth sample moment, 264
k-tuple, 68-69 


L


Lack-of-fit test, 559 
Large-sample confidence intervals, 285-294, 
369-371 
Large-sample hypothesis tests, 331-333, 346-349,
367-369, 392-394
Large-sample confidence bound, 291-292 
Latin square designs, 464-466 
Law of total probability, 80-82 
Least squares estimates, 496-503 
weighted, 547 
Least squares line, 496-505 
Least squares principle, 496-497 
Level α test, 331
Level of significance, 316, 319, 353 
Levels of the factor, 409 
Light tails, 189 
Likelihood function, 268-269 
Likelihood ratio principle, 355-356 
Limiting relative frequency, 60 
Linear combination, 238-243 
distribution of, 238-243 
Linear probabilistic model, 491-496 
Linear relationship, 217 
correlation and, 217-218 
ρ measuring degree of, 217
Line graph, 101-102 
Line of mean values, 492 
Location 
control charts for, 681-690 
measures of. See Measures of location 
Location parameter, 157, 176, 178 
Logistic regression, 557-560
Logit function, 557-558 
Lognormal distribution, 179-181 
Lot tolerance percent defective, 710 
Lower fourth, 40-41 
Lower-tailed test, 328 
LOWESS method, 556 


M 


MAD regression, 547 
Main effects, 451-452 
Mann-Whitney test, 661 
Marginal probability density function, 
202-203 
Marginal probability mass function, 200-201 
Maximum likelihood estimation, 266-270, 
355-356 
complications, 271-272 
large-sample behavior of, 271 
likelihood ratio principle, 355-356 
Maximum likelihood estimator, 266-270, 
355-356 
McNemar test, 398, 408 
Mean, 29-31 
confidence interval, 362-373, 378
correction factor for, 128, 417 
deviations from, 36-38 
grand, 412, 441 
as measure of location, 29-31 
outliers influencing, 31 
population, 30-31, 362-373, 378 
sample, 29-31 
standard error of, 230-231 
trimmed, 32-33, 258 
values, line of, 492 


of a random variable, 122-123 
Mean square error (MSE), 274 
Mean square for treatments (MSTr), 413-414 
Mean squares, expected, 443-444, 461 
Mean value, 109-114, 152, 255 
Measurement error, 185-186, 234, 251, 491 
Measures 
of location, 29-36 
of variability, 36-38 
Median, 31-32, 152 
Memoryless property, 172 
M-estimator, 272 
Method of moments, 264-266 
Midfourth, 49 
Midrange, 49 
Mild outlier, 16, 42 
Minimum variance unbiased estimator, 255-257 
Mixed effects model, 448, 456-457 
Mixed exponential distribution, 195 
Mode, 49 
Model adequacy assessment, 543-550 
Model equation, 426-429 
simple linear regression, 491-492 
single-factor ANOVA, 409, 410-420, 
426-435 
Model utility test, 580-582 
multiple regression, 504-505
simple linear regression, 514-516 
Moment estimators, 264-265 
Moments, method of, 264-266 
Multicollinearity, 606 
Multifactor ANOVA, 437-486 
expected mean squares, 443-444 
experiment analysis, 469-483 
fixed effects model, 432, 439-443, 
451-452, 456-457, 460-464 
introduction, 437 
Latin square designs, 464-466 
mixed and random effects, 448, 456-457 
multiple comparisons procedure, 420-426, 
444, 455 
random effects model, 432-433, 456-457 
randomized block experiment, 444-447 
test procedures, 452-455, 460-464 
three-factor ANOVA, 460-469 
two-factor ANOVA, 438-459 
Multimodal histogram, 22 
Multinomial distribution, 207 
Multinomial experiment, 207, 620 
Multiple comparisons procedure, 420-426, 444, 
455 
multifactor ANOVA, 437-486 
single-factor ANOVA, 409, 410-420, 
426-435 
Multiple correlation coefficient, 579 
Multiple regression, 504-505, 572-594
confidence intervals, 512-514 
coefficient of multiple determination in, 
503-505, 565, 578 
F test for predictor group, 516, 584-585 
general additive model equation, hypothesis 
tests, 555-557, 572 
inferences in, 582-587 
influential observation, 604-605 
model adequacy assessment, 543-550, 
587-588 
models with predictors, 519-527, 572-576 
model utility test, 514-516, 580-582 
multicollinearity, 606 
other issues in, 595-610 
parameter estimation, 496-509, 563-566, 
576-580 
prediction interval, 523-524
standardizing variables, 598-599 


transformations, 532-533, 550-562, 595-598 
variable selection, 599-603 
Multiplication rule for probabilities, 77-80, 
86-87 
Multiplicative exponential model, 551 
Multiplicative power model, 551, 552, 595 
Multivariate data, 4, 24 
Mutually exclusive events, 56 
Mutually independent events, 87-89 


N 


Negative binomial random variable, 128-130 
Negatively skewed histogram, 23 
Nomogram, 706-707 
Noncentral F distribution, 427-428, 429 
Noncentrality parameter, 427-428 
Nonhomogeneous Poisson process, 139-140 
Nonlinear regression, 550-562 
Nonnormal population distribution, 301-302 
Nonparametric procedures. See Distribution-free 
test procedures 
Nonstandard normal distributions, 161-163 
Normal distribution, 156-170 
binomial distributions and, 165-166 
bivariate, 218-219 
Central Limit Theorem and, 232-235 
chi-squared test, 174-175 
confidence intervals and, 286, 295-304 
critical values notation (zα), 160-161
discrete populations and, 164-165 
hypothesis tests and, 311-360 
of a linear combination, 240-241 
nonstandard, 161-163 
percentiles of, 163-164 
point estimation and, 249-250, 254, 257-258, 
259 
population, 231, 327-331 
probability plots and, 184-193 
sample mean and, 231-232 
standard, 158-159 
tolerance critical values for, A-10 
Normal equations, 497 
Normality, 636-637 
checking, 636-637 
Ryan-Joiner test for, A-23 
Normalized expected total error of estimation, 
600 
Normal probability plot, 187-189, 235 
Normal random variable, 158, 240-241 
Null event, 56 
Null hypothesis, 311-312, 619-620, 640, 641, 
643-644 
Null value, 312 
Number-defective data, 697-698 


O 


Objective interpretation of probability, 60-61 
Observational studies, 365 
Observations 
influential, 604-605 
retrospective, 366 
Observed cell counts, 621-622 
Observed significance level (OSL), 323-324, 
645, 648 
Odds, 558-559 
Odds ratio, 559, 597-599 
One-sided confidence intervals, 291-292 
One-tailed test 
lower-tailed, 336 
upper-tailed, 336 
One-way ANOVA, 427-428 




Operating characteristic (OC) curve, 709-710 
Ordered pairs, product rule for, 67-68 
Outlier
boxplot showing, 40-41
extreme, 14, 16, 41, 42-43
mild, 16, 42


P 


Paired data, 382-391 
Paired experiment, unpaired v., 387-388 
Paired t procedures
confidence interval, 385-386 
hypothesis test, 383-385, 386-387 
Parameter estimation, 496-510 
in chi-squared tests, 628-631 
control charts based on, 683-685 
of a function, 270-271 
multiple regression, 504-505, 576-580
polynomial regression, 563-566 
simple linear regression, 496-510 
using least squares, 497-498 
See also Point estimation 
Parameter(s), 451-452 
fixed effects, 451-452 
generic symbol for, 
interaction, 451-452 
location, 683-685 
noncentrality, 427-428 
of a probability distribution, 103, 624-625 
scale, 157, 177, 190-191 
shape, 173, 177, 191 
Parametric function, 424-425
Pareto diagram, 29 
Partial residual plot, 580, 587 
p control chart for fraction defective, 696-697 
Percentile, 32-33
continuous distribution, 150-152 
normal distribution, 163-164 
sample, 184-185 
Permutations, 69-73 
Point estimate, 31, 221, 247 
Point estimation, 247-275
bootstrap method, 260-261
Cauchy distribution, 257-258
censoring procedure, 258-259
correlation coefficient, 530-532
defined, 221, 224 
exponential distribution, 258-260, 265 
functions of parameters, 270-271 
gamma distribution, 265 
general concepts, 248-264 
introduction to, 247 
invariance principle, 270 
least squares method, 496-506 
maximum likelihood, 250, 266-270, 355-356 
methods of, 264-274
method of moments, 264-266
minimum variance unbiased, 255-257
normal distribution, 249-250, 254, 257-258, 
259 
Point estimator, 248-249, 251, 259-260 
biased, 251-253, 255, 257 
bootstrap, 260-261 
complications, 257-259, 271-272
consistent, 274 
defined, 248 
Hodges-Lehmann, 274
large sample behavior, of the MLE, 271 
maximum likelihood, 266-270 
mean squared error, 274 
M-estimator, 272 
with minimum variance, 255-257 
moment, 264-265 
pooled, 263 
reporting, 259-261 
robust, 258, 272 
standard error of, 259-261 
unbiased, 251-255 
Point prediction, 299 
Poisson distribution, 131-136 
binomial distribution and, 117-125, 133 
confidence intervals and, 374 
data transformations and, 432 
exponential distribution and, 170-172 
goodness of fit, 619, 628, 631-632 
hypothesis testing and, 358 
as limit, 132-133 
mean and variance, 134 
point estimation and, 269 
rationale for using, 132-133 
tables, A-4-A-5
Poisson probabilities, cumulative, 
A-4-A-5 
Poisson process, 134-135 
exponential distributions and, 170-172 
nonhomogeneous, 139-140 
Polynomial regression, 562-571 
centering x values, 567-568
coefficient of multiple determination, 565, 578
model equation, 562, 567 
parameter estimation, 563-566 
statistical intervals, 566-567 
test procedures, 566-567 
Pooled estimator, 263, 377-378 
Pooled t procedures, 377-378 
Population, 1-7 
conceptual, 7 
defined, 3 
discrete, 164-165 
hypothetical, 7 
mean, 30-31, 362-373, 378 
median, 31-32
normal distribution, sample mean and, 
231-232 
standard deviation, 38-39, 
304-306 
target, 10 
variance, 304-306
Positively skewed histogram, 23 
Posterior probability, 80-82 
Power, 340-341 
curves, 341, 428 
Power model, 551-552, 597 
Practical significance, 352-353 
Precision, 281-282 
Predicted values, 500 
Prediction interval, 299-300, 523-524 
Prediction level, 300 
Predictor variables, 488 
Principal block, 476-477 
Principle of least squares, 496-497 
Prior probability, 80-82 
Probability, 6, 52-94 
axioms of, 58-59 
conditional, 75-80 
counting techniques and, 66-75 
coverage, 290 
defined, 52, 76-77 
determining systematically, 63 
error, 122, 319 
exceedance, 694 
equally likely outcomes and, 63-64 
histogram, 102, 143f 
inferential statistics and, 5-6 
interpretation of, 59-61
law of total, 80-82 
limits, control charts based on, 
685-686 
multiplication rule of, 77-80, 86-87 
posterior, 80-82 
prior, 80-82 
properties of, 61-63 
statistics v., 6-9 
Probability density function, 142-147 
conditional, 209 
joint, 201-202 
marginal, 202-203
symmetric, 399-400, 621, 653-654 
Probability distribution, 100, 143 
Bernoulli, 97 
beta, 181-182 
binomial, 126-131 
bivariate normal, 218-219 
Cauchy, 257-258 
chi-squared, 174-175, A-21-A-22 
conditional, 75-80 
continuous, 98, 142-146 
discrete, 98-103 
exponential, 170-172 
F, 399-403, 414-416, 580-581 
family, 103 
gamma, 172-174, 265 
geometric, 103, 129 
hypergeometric, 126-128 
joint, 205, 208, 218, 271, 355 
kth moment of, 264, 265 
of a linear combination, 238-243 
lognormal, 179-181 
multinomial, 207, 620 
negative binomial, 128-130 
normal, 156-170 
parameter of, 103 
Poisson, 131-136 
of a sample mean, 230-238 
sampling, 222 
standard normal, 158-161 
statistics and, 220-230 
Studentized range, 420-421, A-20
symmetric, 152 
t, 295-297 
uniform, 144-145, 148f, 149f 
Weibull, 177-179 
Probability histogram, 102, 143 
Probability mass function, 99-102 
cdf and, 104-106 
conditional, 209 
defined, 100 
joint, 199, 267 
marginal, 200-201 
Probability plot, 184-193 
defined, 184 
half-normal, 193 
nonnormal, 189f, 190-192 
normal, 185-190 
sample percentiles and, 184-185 
Process capability index, 680 
Process location, 681-690 
Process variation, 690-695 
Product rule 
general, 68-69 
for ordered pairs, 67-68 
Proportion(s), population 
confidence interval, 285, 289-291 
difference between, 336-337, 
391-398 
hypothesis test, 346-352 
sample, 34, 286-288 
Pure birth process, 274 


P-value, 312-317 
chi-squared test, 624-625 
defined, 316-317 
F test, 400-403 
interpreting, 312-323 
normal population variance, 364 
as a random variable, 324, 341, 343
rejection region v., 314, 325, 328, 338, 344,
352-353, 669
t test, 312-317 
variations, 341-344 
z test, 326-334 


Q 


Quadratic regression, 572-574
Qualitative data, 23-24
Quality control methods, 678-715
acceptance sampling, 708-714
control charts, 679-700
CUSUM procedures, 700-708
Quartiles, 32-33


R 


Random deviation, 491 
Random effects model, 432-433, 456-457 
multifactor ANOVA, 437-486
single-factor ANOVA, 409, 410-420, 426-435
Random error term, 491 
Random interval, 278 
Randomized block experiment, 444-447 
Randomized controlled experiment, 366 
Randomized response technique, 264 
Random sample, 222 
Random variable(s), 96-99 
Bernoulli, 97 
binomial, 119-121 
continuous, 98, 201-204 
correlation coefficient of, 216 
covariance between, 214-216
dependent, 204 
difference between, 239-240 
discrete, 199-201 
expected value of, 109-117 
geometric, 129 
independent, 204-205, 207
jointly distributed, 199-212 
lognormal, 179-181 
more than two, 205-208 
negative binomial, 128-130
normally distributed, 195, 240, 243, 299-300,
426, 448, 456, 562
standard normal, 158 
uncorrelated, 216, 239 
variance of, 664 
Weibull, 177-178, 181, 221 

Range, 36 
Rayleigh distribution, 146, 263, 274 
R control chart, 692-694 
3-sigma control limits, 693 
Rectification, 712-713 
Regression 
analysis, 487-488, 505-506 
ANOVA and, 516 
calibration and, 326, 346, 656 
coefficients, 503-505, 527-530 
cubic, 563, 565, 570-571 
effect, 506 
exponential, 555, 597 
function, 542, 548, 550, 553, 555, 562,
563-564
influential observations, 604-605
intrinsically linear, 550-552
line, 491
logistic, 557-560
LOWESS, 556
model adequacy, 543-550, 587-588
multicollinearity, 606
multiple, 504-505
nonlinear, 542-557, 595
polynomial, 562-571
power, 550-553
quadratic, 572-574
residual analysis, 543-544
simple linear, 487-496
through the origin, 255
transformations, 532-534, 550-562, 595-598
true regression coefficients, 572
true regression function, 550, 562
true regression line, 491-493, 496-500
variable selection, 599-603
Regression analysis, 487-488, 505-506 
Regression coefficients, 503-505, 527-530 
Regression effect, 506 
Regression line
estimated, 491-493
true, 491-493, 496-509
Regression sum of squares, 505, 578, 580 
Relative frequency, 16-18, 60 
Repeated-measures design, 446 
Replication, fractional, 477-480 
Researcher’s hypothesis, 312 
Residual analysis, 543-544 
Residual plots, 544-545 
Residuals, 500, 543-544
standardized, 543-544
sum of squared, 502
Response variables, 488 
Restricted model, 456, 468 
Retrospective observational study, 366 
Robust control charts, 688 
Robust estimator, 258, 272 
Ryan-Joiner test, A-23


S 


Sample(s), 3, 6 
convenience, 10, 406 
defined, 3 
simple random, 10, 222 
stratified, 10 
variability, 36-38 
Sample coefficient of variation, 48, 332 
Sample correlation coefficient, 527-530 
Sample mean, 230-238 
Sample median, 31-32 
Sample moment, 264-265
Sample percentile, 184-185 
Sample proportion, 34 
Sample size, 13 
confidence intervals and, 281-282, 396, 397 
hypothesis tests and, 330-331 
single-factor ANOVA and, 430-432 
small-sample inferences and, 366-367, 397 
type II errors and, 317-323, 350-351, 378-379,
394-395
Sample space, 53-54 
Sample standard deviation, 684-685 
Sample variance, 37-38 
computing formula, 39-40 
motivation for, 38-39 
Sampling 
frame, 9 
variability, 189, 276, 301, 318, 332, 338, 
510, 519 


Sampling distributions, 222-225
approximate, 226 
deriving, 222-225 
random, 222 
sample mean and, 230-238 
simulation experiments and, 225-229 
Sampling frame, 9 
Scale parameter, 157, 177, 190-191 
Scatterplot, 488 
S control chart, 691-692 
3-sigma control limits, 691 
Score confidence interval, 289-290 
Second-order multiple regression model, 
572-573, 585, 586-587, 598, 616 
Set theory, relationship to events, 55-56 
Shape parameter, 173, 177, 191 
Siegel-Tukey test, 677
Signed-rank sequences, 653-655 
Significance 
level, 316-317 
observed level of (OSL), 323-324 
practical vs. statistical, 352-353 
probability and, 319 
Sign interval, 676 
Sign test, 653-654 
Simple event, 54 
Simple hypothesis, 627-628 
Simple linear regression, 487-527 
coefficient of determination in, 503-505 
estimating model parameters in, 
496-503 
hypothesis-testing procedure, 514-516 
inferences based on, 510-527 
introduction, 487-491 
linear probabilistic model, 491-496 
scope of, 505-506 
terminology, 505-506 
Simple random sample, 10, 222 
Simulation experiment, 225-229 
Simultaneous confidence level, 354-355 
Single-factor ANOVA, 409, 410-420, 
426-435 
data transformation and, 431-432 
explanation of, 410-411 
fixed effects model, 432, 439-441 
F distributions and, 414-416 
F test, 414-416, 429 
model equation, 426-429 
notation and assumptions, 412-413 
random effects model, 432-433 
sample sizes, 430-431 
sums of squares, 416-419 
test statistic, 413-414 
Single-sampling plans, in acceptance sampling, 
709-710 
Skewed distribution, 170, 189, 694 
Skewness, 41, 42, 229, 235-236, 380 
Slope, 510-512 
confidence interval, 512-514 
hypothesis-testing procedure, 514-516 
Standard beta distribution, 181 
Standard deviation, 114, 304-306 
confidence interval, 304-306 
continuous random variable, 154 
discrete random variable, 114 
population, 38-39 
sample, 37, 221, 248 
Standard distribution, 190 
Standard error, 259-261 
Standard gamma distribution, 173 
Standardized independent variable, 571 
Standardized residual, 543-544 
Standardized variable, 161, 598-599 




Standardizing, 161-163, 165 
in regression, 543-544 
Standard normal curve, A-6-A-7 
Standard normal distribution, 158-161 
curve areas, 159 
defined, 158 
percentiles of, 159-160 
zα notation and, 160-161 
Standard normal random variable, 158 
Standard order, 471-472 
Standard sampling plans, for acceptance 
sampling, 713 
Statistic, 1, 3, 9, 10 
distribution of, 220-230 
test, 311-326 
See also Data, collecting 
Statistical hypothesis, 311, 352-353 
Statistical intervals, 276-309, 566-567 
introduction, 276 
Statistical significance, 352 
Statistics, 220-230 
branches of, 4-7 
descriptive, 4-5 
enumerative vs. analytic, 9-10 
inferential, 5-6 
probability vs., 9-10 
role of, 1-3 
scope of, 7-9 
software packages, 15, 121 
Stem-and-leaf display, 5, 13-15 
comparative, 25-26 
Step function, 105-106 
Stepwise regression, 602 
Stratified sampling, 10 
Stress ratio, 506 
Studentized range distribution, 420-421, A-20 
Subjective interpretation of probability, 
61, 65 
Sum of squares, 416-417 
ANOVA, 417 
error, 416-417, 502 
interaction, 461 
regression, 502 
total, 486, 504, 516, 578 
treatment, 416-417, 504 
Symmetric distribution, 152, 657 
Symmetric histogram, 23 


T 


Tabular methods, 13-29 

Taguchi methods, 467, 480, 678-679 

Target population 

t critical value, A-9 

t curve tail areas, A-12-A-13 

t distribution, 295-297 
critical values, A-9 
curve tail areas, A-12-A-13 
properties, 10 

Test of hypotheses, 326-334. See also Hypothesis testing 

Test statistic, 313-317 

Three-factor ANOVA, 460-469 
experiment analysis, 469-474 
fixed effects model, 432, 439-441 
Latin square designs, 464-466 

Time series, 50 

T method, 420-423 

Tolerance critical values, for normal population distributions, A-10 

Tolerance intervals, 300-301 

Total probability law, 80-81 

Total sum of squares, 416-417, 504, 516 


Transformation 
ANOVA, 431-432 
control chart, 698-699 
regression, 550-562, 595-598 
Treatments, mean square for, 413-414, 417, 
422, 445 
Treatment sum of squares, 416-417, 504 
Tree diagram, 67-68 
Trials, 99, 117-119, 235-236 
Trimmed mean, 32-33, 258 
True regression coefficients, 572 
True regression function, 550, 562 
True regression line, 500 
t tests 
F tests and, 400-401, 428, 429 
one-sample, 297-299, 335-340 
paired, 383-385 
pooled t, 377-378 
P-value for, 341-344 
two-sample, 374-382, 386-387 
Wilcoxon rank-sum test and, 665 
Wilcoxon signed-rank test and, 657, 659-660 
Tukey’s procedure, 420-424 
Two-factor ANOVA, 438-459 
expected mean squares, 443-444 
fixed effects model, 432, 439-441, 451-452 
mixed effects models and, 448, 456-457 
multiple comparisons procedure, 420-426, 
444, 455 
random effects model, 432-433, 448, 
456-457 
randomized block experiments, 444-447 
test procedures, 441-443, 452-455 
See also Multifactor ANOVA 
Two-sample t procedures, 361-408 
confidence interval, 374-379 
degrees of freedom for, 399-400 
test of hypotheses, 362-374 
Two-tailed test, 328, 364 
Two-way contingency table, 639-648 
chi-squared tests and, 641-648 
defined, 640 
testing for homogeneity, 640-643 
testing for independence, 643-646 
Type I error, 317-323 
probability of, 319 
See also Significance level 
Type II error, 317-323, 331, 350-351 
sample size and, 317-323, 350-351, 
366-367, 394-395 
two-sample t test and, 378-379 
U 


u control chart, 700 
Unbiased estimation, principle of, 253-255 
Unbiased estimator, 251-255 

minimum variance, 255-257 
Unbiasedness, 251-255 


Uncorrelated random variables, 216 

Underscoring, 421-522, 423, 431, 444, 445, 455, 
466 

Unequal class widths, 20 

Unequal sample sizes, 430-431 

Uniform distribution, 144-145, 148f, 149f 

Unimodal histogram, 22 

Union of events, 55 

Univariate data, 3 

Unrestricted model, 456, 457 

Upper fourth, 40-41 

Upper-tailed test, 336 


V 


Variability measures, 36-47 
Variables, 95-117 


categorical, 574-576 
coded, 598 

continuous, 98, 142-146 
defined, 3 

dependent, 488 

discrete, 98 

dummy, 574 
explanatory, 488 
independent, 488 
indicator, 574 

predictor, 488 

random, 96-99, 239-240 
response, 488 
standardized, 598-599 
standard normal random, 158 
transformed, 550-562 
uncorrelated, 216 

types of, 98 


Variable selection, 599-603 


backward elimination, 602 
criteria for, 599-600, 601 
forward selection, 602 
stepwise, 602 


Variance, 113-114 


confidence interval, 304-306, 402 

continuous random variable, 98, 154 

defined, 114 

discrete random variable, 98, 113-114 

expected value and, 109-113 

F test for equality of, 399-402 

hypothesis test, 363-365 

of a linear combination, 239-240 

normal populations with known, 
363-365 

pooled estimator of, 263, 377-378 

population, 304-306 

rules of, 113-114 

sample, 37-38 

shortcut formula for σ², 114-116 

two-factor, 399-403 


Variation 


coefficient of, 48, 183, 332 


control charts for, 690-695 
P-values, 341-344 

Venn diagram, 56 

V-mask, 700-703 


W 


Weibull distribution, 177-179 
distribution samples, 221 
point estimation, 221 
probability plot, 187, 189, 190-191 
Weibull random variable, 221 
Weighted least squares, 547 
Wilcoxon rank-sum interval, 669-671, 
A-27-A-28 
Wilcoxon rank-sum test, 661-666, A-25 
critical values for, A-25 
general description of, 663-664 
development of, 661-663 
efficiency of, 665-666 
large-sample approximation, 658-659 
normal approximation, 664-665 
Wilcoxon signed-rank interval, 667-669, 
A-26 
Wilcoxon signed-rank test, 653-661 
critical values for, A-24 
efficiency of, 659-660 
general description, 655-656 
large-sample approximation, 658-659 
paired observations and, 657-658 
Without-replacement experiment, 119 


X 


X control chart, 681-686, 688 
estimated parameters and, 683-686 
known parameter values and, 681-683 
probability limits and, 119 
supplemental rules for, 688 


Y 


Yates’s method, 472, 473, 475, 478 


Z 


zα notation, 160-161 
z curve, 158-159 
z test, 326-334 

large-sample, 331-333, 368 

normal population distribution with known σ, 327-331 

one-sample, 297, 335-336 

population mean, 326-334, 362-373, 378 

P-value for, 326-327 

two-sample, 362-371 


